I. Introduction

In Part I of this paper [1], we analyzed asynchronous CDMA systems with random spreading sequences
in terms of spectral efficiency constrained to a given chip pulse waveform and in terms of SINR at the
output of an optimum linear multiuser detector. The analysis showed that under realistic conditions,
chip-asynchronous CDMA systems significantly outperform chip-synchronous CDMA systems. In order to
utilize the benefits from chip-asynchronous¹ CDMA, we need efficient algorithms to cope with multiuser
detection for chip-asynchronous users. Therefore, in Part II of this work, we focus on the generalization of
known design rules for low-complexity multiuser detectors to chip-asynchronous CDMA.

A unified framework for the design and analysis of multiuser detectors that admit a multistage
representation for synchronous users was given in [2]. The class of multiuser detectors that admit a multistage
representation is large and includes popular linear multiuser detectors like linear MMSE detectors (e.g. [?]),
reduced rank multistage Wiener filters [5], [?], polynomial expansion detectors [6] or conjugate gradient
methods (e.g. [7]), linear parallel interference cancellers (PIC, e.g. [?], [?]), possibly weighted (e.g. [10]),
and the single-user matched filters. Multistage detectors are constructed around the matched filter concept.
They consist of a projection of the signal into a subspace of the whole signal space by successive matched
filtering and re-spreading, followed by a linear filter in the subspace.

Multistage detectors based on universal weights have been proposed in [11], [12] for CDMA systems in
AWGN channels and extended to more realistic scenarios in [13], [14], [?]. These references make use of the
self-averaging properties of large random matrices to find universal weighting coefficients for the linear filter 
in the subspace. More specifically, the universal weights are obtained by approximating the precise weights 
designed according to some optimality criterion with asymptotically optimum weights, i.e. the optimum 
weights for a CDMA system whose number of users and spreading factor tend to infinity with constant ratio. 
Thanks to the properties of random matrices, asymptotically, these weights become independent of the users' 
spreading sequences and depend only on a few macroscopic system parameters, such as the system load or number
of transmitted symbols per chip, the variance of the noise, and the distribution of the fading. In this way, the
weight design for long-code CDMA simplifies considerably: its complexity becomes independent of both
the number of users in the system and the spreading factor. Moreover, the weights need updating only when
the macroscopic system parameters change. 

¹ As already shown in Part I of this paper [1], asynchronism is beneficial when the relative delays between users are not integer multiples of a
chip interval. To emphasize this requirement we use the term chip-asynchronism instead of asynchronism. 
The fact that users are not received in a time-synchronized manner at the receiver causes two main
problems from a signal processing perspective: (i) the need for an infinite observation window to implement a
linear MMSE detector and (ii) the potential need for oversampling to form sufficient discrete-time statistics.
The need for an infinite observation window is primarily related to asynchronism on the symbol level, not
the chip level. This aspect was addressed in [15], [16], where it was found that multistage detectors need not
have infinite observation windows and can be efficiently implemented without windowing at all. A detailed
overview of the state of the art on statistics, sufficient or not, for multiuser CDMA systems and how to form
them was given in Part I of this paper [1]. In Part I we presented general results with the only constraint
that the sampled noise at the output of the front-end was white. For the sake of clarity and to gain insight into
systems of practical interest, in this Part II we focus on two groups of statistics implementable in practical
systems:

(A) Sufficient statistics obtained by filtering the received signal by a lowpass filter with bandwidth $B_{\mathrm{LOW}}$
larger than the chip-pulse bandwidth and subsequent sampling at rate $2B_{\mathrm{LOW}}$.

(B) Statistics obtained by sampling the output of a filter matched to the chip waveform at the chip rate (chip 
rate sampling). In this case, the sampling instants need to be synchronized with the time delay of each 
user of interest. Thus, different statistics for each user are required. Additionally, the chip pulses at the 
output of the matched filter need to satisfy the Nyquist criterion. In the following we refer to them as root
Nyquist chip-pulse waveforms. 

General results for the design of linear multistage detectors with both kinds of statistics are provided in this
work. The chip pulse waveforms are assumed to be identical for all users. 

For asynchronous CDMA, low-complexity detectors with universal weights are conveniently designed for
statistics (A). In fact, these observables enable a joint processing of all users without loss of information.
Multistage detectors with universal weights and statistics (A) have a complexity order per bit equal to $O(rK)$
if the sampling rate is $r/T_c$. On the contrary, discretization scheme (B) provides different observables for each
user and does not allow for simultaneous joint detection of all users. An implementation of multistage
detectors with universal weights using such statistics implies a complexity order per bit equal to $O(K^2)$.
This approach is still interesting from a complexity point of view if detection of a single user is required.
However, it suffers from a performance degradation due to the sub-optimality of the statistics.

This work is organized in six additional sections. Sections II and III introduce the notation and the system
model for asynchronous CDMA, respectively. In Section IV, multistage detectors for asynchronous CDMA
are reviewed and an implementation which does not suffer from truncation effects is given. The design of
universal weighting is addressed in Section V. Finally, the analytical results are applied to gain further
insight into the system in Section VI, where methods for pulse-shaping, forming sufficient statistics and
synchronization are compared. Conclusions are summed up in Section VII.



II. Notation and Some Useful Definitions 

Throughout Part II we adopt the same notation and definitions already introduced in Part I of this work 
[1]. In order to make Part II self-contained we repeat here definitions useful in this part. Upper and lower
boldface symbols are used respectively for matrices and vectors corresponding to signals spanning a specific 
symbol interval m. Matrices and vectors describing signals spanning more than a symbol interval are denoted 
by upper boldface calligraphic letters. 

In the following, we utilize unitary Fourier transforms both in the continuous time and in the discrete
time domain. The unitary Fourier transform of a function $f(t)$ in the continuous time domain is given
by $F(\omega) = \frac{1}{\sqrt{2\pi}} \int f(t) e^{-j\omega t}\,\mathrm{d}t$. The unitary Fourier transform of a sequence
$\{\ldots, c_{-1}, c_0, c_1, \ldots\}$ in the discrete time domain is given by
$c(\Omega) = \frac{1}{\sqrt{2\pi}} \sum_{n=-\infty}^{+\infty} c_n e^{-j\Omega n}$. We will refer to them simply as Fourier
transforms. We denote the argument of a Fourier transform of a continuous function by $\omega$ and the argument
of a Fourier transform of a sequence by $\Omega$. They are the angular frequency and the normalized angular
frequency, respectively. A function in $\Omega$ is periodic with respect to integer multiples of $2\pi$.
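
As a small numerical illustration of the discrete-time definition, the following Python sketch evaluates the unitary Fourier transform of a finite sequence and checks the $2\pi$-periodicity in $\Omega$. The helper name `unitary_dtft`, the finite index range, and the sample values are assumptions made for this example only.

```python
import numpy as np

def unitary_dtft(c, n0, omega):
    """Unitary Fourier transform of a finite sequence.

    c     : samples c_{n0}, c_{n0+1}, ... (all other samples are taken as zero)
    n0    : time index of the first sample (hypothetical helper argument)
    omega : normalized angular frequencies at which to evaluate c(Omega)
    """
    c = np.asarray(c, dtype=complex)
    omega = np.atleast_1d(np.asarray(omega, dtype=float))
    n = n0 + np.arange(c.size)                   # time indices of the samples
    kernel = np.exp(-1j * np.outer(omega, n))    # e^{-j Omega n}
    return kernel @ c / np.sqrt(2.0 * np.pi)

seq = [1.0, -0.5, 0.25]
omega = np.linspace(-np.pi, np.pi, 7)
values = unitary_dtft(seq, n0=-1, omega=omega)
shifted = unitary_dtft(seq, n0=-1, omega=omega + 2.0 * np.pi)
print(np.allclose(values, shifted))
```

Because the time indices are integers, shifting $\Omega$ by $2\pi$ leaves every term $e^{-j\Omega n}$ unchanged, which is what the comparison confirms.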

For further studies it is convenient to define the concept of r-block-wise circulant matrices of order N . 

Definition 1 Let r and N be positive integers. An r-block-wise circulant matrix of order N is an rN x N 
matrix of the form 

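$$C = \begin{pmatrix} B_0 & B_1 & \cdots & B_{N-1} \\ B_{N-1} & B_0 & \cdots & B_{N-2} \\ \vdots & \vdots & \ddots & \vdots \\ B_1 & B_2 & \cdots & B_0 \end{pmatrix} \tag{1}$$

with $B_i = (c_{1,i}, c_{2,i}, \ldots, c_{r,i})^T$.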
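
The structure in Definition 1 can be sketched in a few lines of Python. This is a minimal illustration under the assumption that the $r$ sequences are stored as the rows of an $r \times N$ array, so that column $i$ is the block $B_i$; the function name `block_circulant` is hypothetical.

```python
import numpy as np

def block_circulant(c):
    """Build the rN x N r-block-wise circulant matrix of Definition 1.

    c is an r x N array whose column i is the block B_i = (c_{1,i}, ..., c_{r,i})^T.
    """
    c = np.asarray(c)
    r, n = c.shape
    # Block row j equals block row 0 circularly shifted j positions to the right.
    rows = [np.roll(c, shift=j, axis=1) for j in range(n)]
    return np.vstack(rows)

c = np.arange(6).reshape(2, 3)   # r = 2, N = 3
C = block_circulant(c)
print(C)
```

With $r = 2$ and $N = 3$, the second block row reads $[B_2, B_0, B_1]$ and the last one $[B_1, B_2, B_0]$, matching the first and last lines of the matrix in (1).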



In the matrix $C$, an $r \times N$ block row is obtained by a circular right shift of the previous block row. Since the
matrix $C$ is uniquely defined by the unitary Fourier transforms of the sequences $\{c_{s,0}, c_{s,1}, \ldots, c_{s,N-1}\}$, for
$s = 1, \ldots, r$,

$$c_s(\Omega) = \frac{1}{\sqrt{2\pi}} \sum_{k=0}^{N-1} c_{s,k}\, e^{-j\Omega k}, \qquad s = 1, \ldots, r,$$

there exists a bijection $\mathfrak{F}$ from the frequency dependent vector $c(\Omega) = [c_1(\Omega), c_2(\Omega), \ldots, c_r(\Omega)]$ to $C$. Thus,

$$C = \mathfrak{F}\{c(\Omega)\}. \tag{2}$$

Furthermore, the superscripts $\cdot^T$, $\cdot^H$, and $\cdot^*$ denote the transpose, the conjugate transpose, and the
conjugate of the matrix argument, respectively. $I_n$ is the identity matrix of size $n \times n$ and $\mathbb{C}$, $\mathbb{Z}$, $\mathbb{Z}^+$, $\mathbb{N}$, and
$\mathbb{R}$ are the sets of complex, integer, nonnegative integer, natural, and real numbers, respectively. $\mathrm{tr}(\cdot)$ is
the trace of the matrix argument and $\mathrm{span}(v_1, v_2, \ldots, v_s)$ denotes the vector space spanned by the $s$ vectors
$v_1, v_2, \ldots, v_s$. $\mathrm{diag}(\ldots) : \mathbb{C}^n \to \mathbb{C}^{n \times n}$ transforms an $n$-dimensional vector $v$ into a diagonal matrix of size $n$
having as diagonal elements the components of $v$ in the same order. $E\{\cdot\}$ and $\Pr\{\cdot\}$ are the expectation and
probability operators, respectively. $\delta_{ij}$ is the Kronecker symbol and $\delta(\lambda)$ is the Dirac delta function. mod
denotes the modulus and $\lfloor \cdot \rfloor$ is the operator that yields the maximum integer not greater than its argument.

III. System Model

In this section we briefly recall the system model for asynchronous CDMA introduced in Sections IV and
VII of Part I of this work [1]. The reader interested in the details of the derivation can refer to [1].

Let us consider an asynchronous CDMA system with $K$ active users in the uplink channel with spreading
factor $N$. Each user and the base station are equipped with a single antenna. The channel is flat fading
and impaired by additive white Gaussian noise with power spectral density $N_0$. The symbol interval is
denoted with $T_s$ and $T_c = T_s/N$ is the chip interval. The modulation of all users is based on the same chip
pulse waveform $\psi(t)$ bandlimited with bandwidth $B$, unitary Fourier transform $\Psi(\omega)$, and energy
$E_\psi = \int |\psi(t)|^2\,\mathrm{d}t$.

The time delays of the $K$ users are denoted with $\tau_k$, $k = 1, \ldots, K$. Without loss of generality we can
assume (i) user 1 as reference user so that $\tau_1 = 0$; (ii) the users ordered according to increasing time delay
with respect to the reference user, i.e. $\tau_1 \le \tau_2 \le \ldots \le \tau_K$; (iii) the time delay to be, at most, one symbol
interval so that $\tau_k \in [0, T_s)$.²

As for the results presented in Part I, the mathematical results presented in this second part hold for any
front-end that keeps the sampled noise white at its output. However, in order to get better insights into
the physical system we focus on two front-ends of practical and theoretical interest. Both of them satisfy
the more general assumption underlying the results in Part I. We refer to them as Front-end Type A and
Front-end Type B.

² For a thorough discussion on this assumption the reader can refer to [3].
Front-end Type A consists of

• An ideal lowpass filter with cut-off frequency $\omega = \frac{\pi r}{T_c}$, where $r \in \mathbb{Z}^+$ satisfies the constraint $B \le \frac{r}{2T_c}$
such that the sampling theorem applies. The filter is normalized to obtain a unit overall amplification
factor, i.e. the transfer function is

$$G(\omega) = \begin{cases} \frac{1}{\sqrt{E_\psi}} & |\omega| \le \frac{\pi r}{T_c} \\ 0 & |\omega| > \frac{\pi r}{T_c}. \end{cases} \tag{3}$$

• A subsequent continuous-discrete time conversion by sampling at rate $\frac{r}{T_c}$.

This front-end satisfies the conditions of the sampling theorem and, thus, provides sufficient discrete-time
statistics. For convenience, the sampling rate is an integer multiple of the chip rate. Additionally, the
discrete-time noise process is white with zero mean and variance $\sigma^2 = \frac{N_0 r}{E_\psi T_c}$.
Front-end Type B consists of

• A filter $G(\omega)$ matched to the chip pulse and normalized to the chip pulse energy, i.e. $G(\omega) = \Psi^*(\omega) E_\psi^{-\frac{1}{2}}$;

• Subsequent sampling at the chip rate.

When used with root Nyquist chip pulses, the discrete time noise process {w\p\ } is white with variance 

For synchronous systems with square root Nyquist chip pulses, this front-end provides sufficient statistics,
whereas the observables are not sufficient if the system is asynchronous.

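The chip waveform at the filter output is denoted by $\phi(t)$ and its unitary Fourier transform by $\Phi(\omega)$. The
well-known relations $\phi(t) = \psi(t) * g(t)$ and $\Phi(\omega) = \Psi(\omega) G(\omega)$ hold. The unitary Fourier transform of the
chip pulse waveform $\phi(t)$ sampled at rate $\frac{1}{T_c}$ and delay $\tau$ is given by

$$\phi(\Omega, \tau) = \frac{1}{T_c} \sum_{s=-\infty}^{+\infty} e^{j \frac{\tau}{T_c} (\Omega + 2\pi s)}\, \Phi^*\!\left(\frac{\Omega + 2\pi s}{T_c}\right). \tag{4}$$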

Sufficient statistics for asynchronous CDMA require an infinite observation window. In the following, we 
introduce a matrix system model corresponding to an infinite observation window. 

5 For the sake of compactness of some of the results, we adopt a different normalization from the one in Part I. Here, the signal energy at the 
output of the front-end is equal to one. In Part I, the energy of the analog filter's impulse response is normalized to unity. The variance of the 
sampled noise at the front-end output changes accordingly. 
Let us denote with $b^{(m)}$ and $y^{(m)}$ the vectors of transmitted and received signals at time instants $m \in \mathbb{Z}$.
The baseband discrete-time asynchronous system is given by

$$\mathcal{Y} = \mathcal{H}\mathcal{B} + \mathcal{W} \tag{5}$$

where $\mathcal{Y} = [\ldots, y^{(m-1)T}, y^{(m)T}, y^{(m+1)T}, \ldots]^T$ and $\mathcal{B} = [\ldots, b^{(m-1)T}, b^{(m)T}, b^{(m+1)T}, \ldots]^T$ are
infinite-dimensional vectors of received and transmitted symbols, respectively; $\mathcal{W}$ is an infinite-dimensional noise
vector; and $\mathcal{H}$ is a bi-diagonal block matrix of infinite size given by

$$\mathcal{H} = \begin{pmatrix} \ddots & \ddots & & & \\ & H_d^{(m-1)} & H_u^{(m)} & & \\ & & H_d^{(m)} & H_u^{(m+1)} & \\ & & & \ddots & \ddots \end{pmatrix}. \tag{6}$$

Here, $H_u^{(m)}$ and $H_d^{(m)}$ are matrices of size $rN \times K$ obtained by the decomposition of the $2rN \times K$ matrix
$H^{(m)}$ into two parts such that $H^{(m)} = [H_u^{(m)T}, H_d^{(m)T}]^T$. For $H^{(m)}$ the relation

$$H^{(m)} = S^{(m)} A \tag{7}$$

holds, where $A$ is the $K \times K$ diagonal matrix of the received amplitudes $a_k$ and $S^{(m)}$ is the $2rN \times K$ matrix
whose $k$-th column accounts for the spreading of the symbol transmitted by user $k$ in the symbol interval $m$,
due to the actual spreading sequence, the channel delay, and filtering and sampling at the front-end. We
refer to it as the matrix of virtual spreading. More specifically, the matrix of virtual spreading is given by

$$S^{(m)} = \left(\Phi_1 s_1^{(m)}, \Phi_2 s_2^{(m)}, \ldots, \Phi_K s_K^{(m)}\right) \tag{8}$$



where $s_k^{(m)}$ is the $N$-dimensional column vector of the spreading sequence of user $k$ for the transmitted
symbol $m$ and $\Phi_k$ is the $2rN \times N$ matrix taking into account the effects of the chip pulse shape and the time
delay $\tau_k$ of user $k$. Let us decompose $\tau_k$ in $\bar\tau_k = \lfloor \tau_k / T_c \rfloor$ and $\tilde\tau_k = \tau_k - T_c \bar\tau_k = \tau_k \bmod T_c$, the integer number
of chips the signal is delayed and its delay within a chip, respectively. The matrix $\Phi_k$ is of the form

$$\Phi_k = \begin{pmatrix} 0_{\bar\tau_k} \\ \tilde\Phi_k \\ 0_{N-\bar\tau_k} \end{pmatrix} \tag{9}$$

where $0_{\bar\tau_k}$ and $0_{N-\bar\tau_k}$ are zero matrices of dimensions $\bar\tau_k \times N$ and $(N - \bar\tau_k) \times N$, respectively; $\tilde\Phi_k$ is an
$r$-block-wise circulant matrix of order $N$ as in (1),

$$\tilde\Phi_k = \mathfrak{F}(c(\tilde\tau_k)), \tag{10}$$

with

$$c(\tilde\tau_k) = \left[\phi(\Omega, \tilde\tau_k),\ \phi\!\left(\Omega, \tilde\tau_k - \frac{T_c}{r}\right),\ \ldots,\ \phi\!\left(\Omega, \tilde\tau_k - \frac{(r-1)T_c}{r}\right)\right].$$

Thus, the virtual spreading sequences are the samples of the delayed continuous-time spreading waveforms
at sampling rate $r/T_c$.

Throughout this work we assume that the transmitted symbols are uncorrelated and identically distributed
random variables with unit variance and zero mean, i.e. $E\{\mathcal{B}\} = \mathbf{0}$ and $E\{\mathcal{B}\mathcal{B}^H\} = \mathcal{I}$, with $\mathbf{0}$ and
$\mathcal{I}$ being the infinite-dimensional zero vector and the infinite-dimensional identity matrix, respectively. The
elements of the spreading sequences are assumed to be zero mean i.i.d. Gaussian random variables over all the
users, chips, and symbols with $E\{s_k^{(m)} s_k^{(m)H}\} = \frac{1}{N} I_N$. Finally, $h_{k,m}$ denotes the column of the matrix $\mathcal{H}$
containing the $k$-th column of the matrix $H^{(m)}$. We define the correlation matrices $\mathcal{T} = \mathcal{H}\mathcal{H}^H$ and
$\mathcal{R} = \mathcal{H}^H\mathcal{H}$. The system load $\beta = \frac{K}{N}$ is the number of transmitted symbols per chip.

IV. Multistage Structures for Asynchronous CDMA 

We consider the large class of linear multistage detectors for asynchronous CDMA. Let $\chi_{L,k}^{(m)}(\mathcal{H})$ be the
Krylov subspace [17] of rank $L \in \mathbb{Z}^+$ given by

$$\chi_{L,k}^{(m)}(\mathcal{H}) = \mathrm{span}\left(\mathcal{T}^{\ell} h_{k,m}\right)\Big|_{\ell=0}^{L-1}. \tag{11}$$

A multistage detector of rank $L \in \mathbb{Z}^+$ for user $k$ is given by

$$M_k^{(m)} = \sum_{\ell=0}^{L-1} \left(w_k^{(m)}\right)_\ell\, h_{k,m}^H \mathcal{T}^{\ell} \tag{12}$$

where $w_k^{(m)}$ is the $L$-dimensional vector of weight coefficients.

It has been shown in [16] that, given the weight vector $w_k^{(m)}$, the detection of the symbol $b_k^{(m)}$ by the
multistage detector of rank $L$ in (12) can be performed with finite delay $L$ using the implementation scheme
in Figure 1. Although infinite length vectors and infinite dimension matrices appear in (12), the multistage
detector in Figure 1 implements exactly (12) and does not suffer from truncation effects. Equivalently, the
multistage detector in Figure 1 can be considered as a multistage detector processing data over an observation
window of size $2L$. The projection of the received vector $\mathcal{Y}$ onto the subspaces $\chi_{L,k}^{(m)}(\mathcal{H})$, for $k = 1, \ldots, K$,
is performed jointly for all users and requires only multiplications between vectors and matrices. The size
of those vectors and matrices does not depend on the observation window. For further details the interested
reader is referred to [16], [18].
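
The joint projection onto the Krylov subspaces can be illustrated with a finite-dimensional Python sketch. It assumes a finite matrix `H` standing in for the infinite matrix $\mathcal{H}$ and identical weights for all users; the function names `krylov_statistics` and `multistage_output` are hypothetical. The statistics $h_k^H \mathcal{T}^{\ell} y$ are obtained for all users at once by alternating matched filtering and re-spreading, without ever forming a power of $\mathcal{T}$.

```python
import numpy as np

def krylov_statistics(H, y, L):
    """Return X of shape (L, K) with X[l, k] = h_k^H T^l y, where T = H H^H.

    Each stage applies a matched filter (H^H) followed by re-spreading (H).
    """
    stats = []
    v = np.asarray(y, dtype=complex)
    for _ in range(L):
        z = H.conj().T @ v      # matched filtering, jointly for all users
        stats.append(z)
        v = H @ z               # re-spreading: v now equals T^(l+1) y
    return np.array(stats)

def multistage_output(H, y, w):
    """Detector output sum_l w[l] * h_k^H T^l y for every user k (shared weights)."""
    X = krylov_statistics(H, y, len(w))
    return np.asarray(w) @ X

rng = np.random.default_rng(0)
n, K, L = 8, 3, 3
H = (rng.standard_normal((n, K)) + 1j * rng.standard_normal((n, K))) / np.sqrt(2 * n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)
X = krylov_statistics(H, y, L)
```

Each stage costs two matrix-vector products, so the work grows linearly with the rank $L$; with a single weight equal to one the output reduces to the bank of single-user matched filters.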



Fig. 1. Multistage detector for asynchronous CDMA systems.



The class of multistage detectors includes many popular multiuser detectors: 

• the single-user matched filter for $L = 1$,

• the linear parallel interference canceller (PIC) [19], [20] for weight coefficients chosen irrespective of
the properties of the transfer matrix $\mathcal{H}$,

• the polynomial expansion detector [6] and the conjugate gradient method [7], if the weight coefficients
are identical for all users and chosen to minimize the mean square error,

• the (reduced rank) multistage Wiener filter [5] if the weight coefficients are chosen to minimize the
mean square error, but are allowed to differ from user to user.

Throughout this work we refer to detectors that minimize the MSE in the projection subspace of the user of 
interest as optimum detectors in the MSE sense. More specifically this class of multistage detectors includes 
the linear MMSE detector and the multistage Wiener filter but not the polynomial expansion detector. 

In the following we focus on the design of multistage Wiener filters implemented as in Figure 1. This
reduces the problem to the design of the filter coefficients $w_k^{(m)}$. The multistage Wiener filter for the detection
of the symbol $m$ transmitted by user $k$ reads

$$M_k^{(m)} = \sum_{\ell=0}^{L-1} \left(w_k^{(m)}\right)_\ell\, h_{k,m}^H \mathcal{T}^{\ell}. \tag{13}$$
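The weight vector $w_k^{(m)}$ that minimizes the MSE $E\{|M_k^{(m)}\mathcal{Y} - b_k^{(m)}|^2\}$ is given by

$$w_k^{(m)} = \operatorname*{argmin}_{w}\, E\left\{\left|\sum_{\ell=0}^{L-1} (w)_\ell\, h_{k,m}^H \mathcal{T}^{\ell}\, \mathcal{Y} - b_k^{(m)}\right|^2\right\} \tag{14}$$

$$\phantom{w_k^{(m)}} = \operatorname*{argmin}_{w}\, E\left\{\left| w^T x_k^{(m)} - b_k^{(m)}\right|^2\right\} \tag{15}$$

where $x_k^{(m)}$ is an $L$-dimensional vector with $j$-th element $(x_k^{(m)})_j = h_{k,m}^H \mathcal{T}^{j-1} \mathcal{Y}$. This optimization
problem is solved by the Wiener-Hopf theorem [21] and $w_k^{(m)}$ is given by

$$w_k^{(m)} = \left(\Xi_k^{(m)}\right)^{-1} \xi_k^{(m)}$$

where $\Xi_k^{(m)} = E\{x_k^{(m)} x_k^{(m)H}\}$ and $\xi_k^{(m)} = E\{b_k^{(m)*} x_k^{(m)}\}$. It is straightforward to verify that in this case

$$\Xi_k^{(m)} = \begin{pmatrix} (\mathcal{R}^2)_{k,m} + \sigma^2 (\mathcal{R})_{k,m} & \cdots & (\mathcal{R}^{L+1})_{k,m} + \sigma^2 (\mathcal{R}^{L})_{k,m} \\ (\mathcal{R}^3)_{k,m} + \sigma^2 (\mathcal{R}^2)_{k,m} & \cdots & (\mathcal{R}^{L+2})_{k,m} + \sigma^2 (\mathcal{R}^{L+1})_{k,m} \\ \vdots & \ddots & \vdots \\ (\mathcal{R}^{L+1})_{k,m} + \sigma^2 (\mathcal{R}^{L})_{k,m} & \cdots & (\mathcal{R}^{2L})_{k,m} + \sigma^2 (\mathcal{R}^{2L-1})_{k,m} \end{pmatrix} \tag{16}$$

and

$$\xi_k^{(m)} = \left((\mathcal{R})_{k,m}, (\mathcal{R}^2)_{k,m}, \ldots, (\mathcal{R}^{L})_{k,m}\right)^T \tag{17}$$

where $(\mathcal{R}^s)_{k,m} = h_{k,m}^H \mathcal{T}^{s-1} h_{k,m}$ is the diagonal element of the matrix $\mathcal{R}^s$ corresponding to the $m$-th
symbol transmitted by user $k$.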



V. Universal Weight Design 

Consider the SINR of any linear detector that admits a multistage representation. Let $w_k^{(m)}$ be the weight
vector for the detection of the $m$-th symbol transmitted by user $k$. Then, the SINR at the output of the
multistage detector is given by

$$\mathrm{SINR}_k^{(m)} = \frac{w_k^{(m)H}\, \xi_k^{(m)} \xi_k^{(m)T}\, w_k^{(m)}}{w_k^{(m)H} \left(\Xi_k^{(m)} - \xi_k^{(m)} \xi_k^{(m)T}\right) w_k^{(m)}}. \tag{18}$$

The performance of multistage Wiener filters simplifies to

$$\mathrm{SINR}_k^{(m)} = \frac{\xi_k^{(m)T} \left(\Xi_k^{(m)}\right)^{-1} \xi_k^{(m)}}{1 - \xi_k^{(m)T} \left(\Xi_k^{(m)}\right)^{-1} \xi_k^{(m)}}. \tag{19}$$

From (16), (18), and (19) it is apparent that the diagonal elements of the matrix $\mathcal{R}^s$ play a fundamental role
in the design and analysis of multistage detectors.
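
The role of the diagonal elements can be made concrete with a finite-dimensional Python sketch. It assumes a finite matrix `H` in place of $\mathcal{H}$, unit-power symbols, and white noise of variance `sigma2`; the function name `wiener_weights_and_sinr` is hypothetical. The weights and the SINR are computed only from the numbers $h_k^H \mathcal{T}^{s-1} h_k$, $s = 1, \ldots, 2L$, in the Hankel-type structure of (16) and (17).

```python
import numpy as np

def wiener_weights_and_sinr(H, k, sigma2, L):
    """Rank-L multistage Wiener weights and SINR for column k of a finite H."""
    h = H[:, k]
    T = H @ H.conj().T
    # m[s] = (R^s)_{kk} = h^H T^(s-1) h for s = 1..2L (m[0] is unused)
    m = np.zeros(2 * L + 1)
    v = h.copy()
    for s in range(1, 2 * L + 1):
        m[s] = np.real(np.vdot(h, v))
        v = T @ v
    idx = np.arange(1, L + 1)
    power = idx[:, None] + idx[None, :]        # i + j for i, j = 1..L
    Xi = m[power] + sigma2 * m[power - 1]      # entries as in (16)
    xi = m[idx]                                # entries as in (17)
    w = np.linalg.solve(Xi, xi)                # Wiener-Hopf solution
    q = float(xi @ w)                          # xi^T Xi^{-1} xi
    return w, q / (1.0 - q)                    # SINR as in (19)

rng = np.random.default_rng(1)
n, K = 8, 3
H = (rng.standard_normal((n, K)) + 1j * rng.standard_normal((n, K))) / np.sqrt(2 * n)
sigma2 = 0.1
_, sinr_mf = wiener_weights_and_sinr(H, 0, sigma2, L=1)
_, sinr_full = wiener_weights_and_sinr(H, 0, sigma2, L=3)
```

In this toy setting the rank-1 result coincides with the single-user matched filter SINR, and with $L$ equal to the number of users the Krylov subspace already contains the linear MMSE filter, so the full-rank MMSE SINR is reached.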
It has been shown in [2] that, if the spreading sequences are random and the CDMA system is synchronous,
the diagonal elements of the matrix $\mathcal{R}^s$, $s \in \mathbb{Z}^+$, converge to deterministic values as $K, N \to \infty$ with
constant ratio. This asymptotic convergence holds for some classes of random matrices and is a stronger
property than the convergence of the eigenvalue distribution. The Stieltjes transform of the asymptotic
eigenvalue distribution of $\mathcal{R}$ is related to the SINR at the output of the linear MMSE detector, as pointed
out first in [22] for synchronous CDMA systems. The asymptotic eigenvalue moments of $\mathcal{R}$ enable the
asymptotic performance analysis of reduced rank multistage Wiener filters [23] and the design of multistage
detectors with quadratic complexity order per bit [14], [13]. The convergence of the diagonal elements
of $\mathcal{R}^s$ has been utilized in [2] for the design of multistage detectors with linear complexity order per bit
in synchronous CDMA systems and for the asymptotic analysis of any multistage detector not necessarily
optimum in an MSE sense. In the following we extend the results in [2] to the case of asynchronous CDMA
systems, making use of the asymptotic properties of the random matrix $\mathcal{R}$ for asynchronous CDMA systems.

The design of low complexity multistage detectors is based on the approximation of the weight vectors
$w_k^{(m)}$ by their asymptotic limit when $K, N \to \infty$ with constant ratio $\beta$,

$$w_k^{\infty} = \lim_{K = \beta N \to \infty} \left(\Xi_k^{(m)}\right)^{-1} \xi_k^{(m)}. \tag{20}$$

Thanks to the fact that the diagonal elements of $\mathcal{R}^s$ can be computed by a polynomial in a few macroscopic
system parameters, the computation of the weight vectors becomes independent of the size of $\mathcal{R}$ and
independent of $m$. Thus, the effort for the computation of the weights becomes negligible and the complexity
of the detector is dominated by the joint projection of the received signal $\mathcal{Y}$ onto the subspaces $\chi_{L,k}^{(m)}(\mathcal{H})$,
$k = 1, \ldots, K$ and $m \in \mathbb{Z}$. This projection has linear complexity per bit if the multistage detector in Figure 1
is utilized.

The convergence of the diagonal elements of $\mathcal{R}^{\ell}$ to deterministic values is established in the following
theorem. The definitions and the assumptions in the statement of Theorem 1 summarize and formalize the
characteristics of system model (5) for $\tau_k \in [0, T_s]$.

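Theorem 1 Let $K, N \in \mathbb{N}$ and $A \in \mathbb{C}^{K \times K}$ be a diagonal matrix with $k$-th diagonal element $a_k \in \mathbb{C}$.
$T_s$ and $T_c$ are positive reals with $T_s = N T_c$. Given $\{\tau_1, \tau_2, \ldots, \tau_K\}$, a set of delays in $[0, T_s)$, we
introduce the set of delays in $[0, T_c)$ defined as $\{\tilde\tau_k : \tilde\tau_k = \tau_k \bmod T_c,\ k = 1, \ldots, K\}$ and the set of
normalized delays $\{\bar\tau_k : \bar\tau_k = \lfloor \tau_k / T_c \rfloor,\ k = 1, \ldots, K\}$. Given a function $\Phi(\omega) : \mathbb{R} \to \mathbb{C}$, let $\phi(\Omega, \tau)$ be as in (4).
Given a positive integer $r$, let $\tilde\Phi_k$, $k = 1, \ldots, K$, be $r$-block-wise circulant matrices of order $N$ defined in (10)
and $S^{(m)} = (\Phi_1 s_1^{(m)}, \Phi_2 s_2^{(m)}, \ldots, \Phi_K s_K^{(m)})$ with $s_k^{(m)}$ an $N$-dimensional random column vector. Let
$H^{(m)} = (H_u^{(m)T}, H_d^{(m)T})^T = S^{(m)} A$ with $H_u^{(m)}, H_d^{(m)} \in \mathbb{C}^{rN \times K}$ and $\mathcal{H}$ the infinite block row and block column
matrix of the same form as in (6), $\mathcal{T} = \mathcal{H}\mathcal{H}^H$, $\mathcal{R} = \mathcal{H}^H\mathcal{H}$, and $h_{k,m}$ the column of $\mathcal{H}$ corresponding to
the $m$-th symbol of user $k$.

We assume that the function $\Phi(\omega)$ is upper bounded and has finite support. The receive filter is such
that the sampled discrete time noise process is white. The vectors $s_k^{(m)}$ are independent with i.i.d. zero-mean
circularly symmetric Gaussian elements with variance $E\{|s_{ij}|^2\} = N^{-1}$. Furthermore, the elements
$a_k$ of the matrix $A$ are uniformly bounded for any $K$. The sequence of the empirical joint distributions
$F^{(K)}_{|A|^2, T}(\lambda, \tau) = \frac{1}{K} \sum_{k=1}^{K} 1(\lambda - |a_k|^2)\, 1(\tau - \tilde\tau_k)$ converges almost surely, as $K \to \infty$, to a non-random
distribution function $F_{|A|^2, T}(\lambda, \tau)$.

Then, conditioned on $(|a_k|^2, \tilde\tau_k)$, the corresponding diagonal elements of the matrices $\mathcal{R}^{\ell}$ converge almost
surely to the deterministic value

$$\lim_{K = \beta N \to \infty} (\mathcal{R}^{\ell})_{k,m} = \lim_{K = \beta N \to \infty} h_{k,m}^H \mathcal{T}^{\ell-1} h_{k,m} = R_\ell(|a_k|^2, \tilde\tau_k) \tag{21}$$

with $R_\ell(|a_k|^2, \tilde\tau_k)$ determined by the following recursion

$$R_\ell(\lambda, \tau) = \sum_{s=0}^{\ell-1} g(T_{\ell-s-1}, \lambda, \tau)\, R_s(\lambda, \tau) \tag{22}$$

and

$$T_\ell(\Omega) = \sum_{s=0}^{\ell-1} f(R_{\ell-s-1}, \Omega)\, T_s(\Omega) \qquad -\pi \le \Omega \le \pi \tag{23}$$

$$f(R_\ell, \Omega) = \beta \int \lambda\, \Delta_{\phi,r}(\Omega, \tau)\, \Delta_{\phi,r}^H(\Omega, \tau)\, R_\ell(\lambda, \tau)\, \mathrm{d}F_{|A|^2, T}(\lambda, \tau) \qquad -\pi \le \Omega \le \pi \tag{24}$$

$$g(T_\ell, \lambda, \tau) = \lambda \int_{-\pi}^{\pi} \Delta_{\phi,r}^H(\Omega, \tau)\, T_\ell(\Omega)\, \Delta_{\phi,r}(\Omega, \tau)\, \mathrm{d}\Omega \tag{25}$$

with

$$\Delta_{\phi,r}(\Omega, \tau) = \left(\phi(\Omega, \tau),\ \phi\!\left(\Omega, \tau - \frac{T_c}{r}\right),\ \ldots,\ \phi\!\left(\Omega, \tau - \frac{T_c (r-1)}{r}\right)\right)^T. \tag{26}$$

The recursion is initialized by setting $T_0(\Omega) = I_r$ and $R_0(\lambda, \tau) = 1$.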

Theorem 1 is proven in Appendix I.

Note that the asymptotic diagonal elements of $\mathcal{R}^{\ell}$ depend on the delay $\tau_k$ only via the delay of a chip pulse
waveform within a chip, i.e. via $\tilde\tau_k$, while any delay multiple of $T_c$ leaves the diagonal elements unchanged.

From Theorem 1 we can obtain $m_{\mathcal{R}}^{(\ell)}$, the asymptotic eigenvalue moment of the matrix $\mathcal{R}$ of order $\ell$, by
using the relation

$$m_{\mathcal{R}}^{(\ell)} = E\{R_\ell(\lambda, \tau)\}$$

where the expectation is taken over the limit distribution $F_{|A|^2, T}(\lambda, \tau)$. For $r = 1$ and $F_{|A|^2, T}(\lambda, \tau) =
F_{|A|^2}(\lambda)\delta(\tau)$, i.e. for synchronous systems sampled at the chip rate, and $\Phi(\omega)$ satisfying the Nyquist criterion,
the recursive equations (23), (24), and (25) reduce to the recursion in [2, Theorem 1].

This theorem is very general and holds for all chip pulses of practical interest. Furthermore, no constraint
is imposed on the time delay distribution. The choice of the front end in this work is restricted only by the
applicability of (18) or (19), which imply white noise at the front end. Then, since both Front-end A and
Front-end B keep the sampled noise white, Theorem 1 applies to both of them.

Now, we specialize Theorem 1 to a case of theoretical and practical interest, where sufficient statistics are
utilized in the detection, the chip pulse waveform $\phi(t)$ is band-limited, and the sequence of the empirical
distribution functions of the time delays converges to a uniform distribution function as $K \to +\infty$. The
constraint to use sufficient statistics restricts the class of front-ends. The following results apply to Front-end
A but, in general, not to Front-end B.

Corollary 1 Let us adopt the same definitions as in Theorem 1 and let the same assumptions of Theorem
1 be satisfied. Additionally, assume that the random variables $\lambda$ and $\tau$ in $F_{|A|^2, T}(\lambda, \tau)$ are statistically
independent and the random variable $\tau$ is uniformly distributed. Furthermore, $\Phi(\omega)$ is bounded in absolute
value, and bandlimited with bandwidth $B \le \frac{r}{2T_c}$. Then, given $(|a_k|^2, \tilde\tau_k)$ and $m \in \mathbb{Z}$, the corresponding
diagonal element of the matrix $\mathcal{R}^{\ell}$ converges almost surely to a deterministic value, conditionally on $|a_k|^2$,

$$\lim_{K = \beta N \to \infty} (\mathcal{R}^{\ell})_{k,m} = \lim_{K = \beta N \to \infty} h_{k,m}^H \mathcal{T}^{\ell-1} h_{k,m} = R_\ell(|a_k|^2)$$

with $R_\ell(\lambda)|_{\lambda = |a_k|^2}$ determined by the following recursion:

1-1 



s=0 



and 

r 1 

T t {w) = yY1 f(Ri-s-i)jr 1$ Ml* T,{u) -2nB <uj<2uB 

c s=0 c 

f(Ri)=P J XR t (X)dF\^{\) 

u t = ^=- I \§{u)\ 2 T t {u)&u. 
^L c J_ 2nB 

The recursion is initialized by setting $T_0(\omega) = 1$ and $R_0(\lambda) = 1$.
Corollary 1 is derived in Appendix II.

The eigenvalue moments of $\mathcal{R}$ can be expressed in terms of the auxiliary quantities $f(R_s)$ and $\nu_s$ in the
recursion of Corollary 1 by the following expression:



— u\*w\i)J — 

s=0 



E{R e (X)} = J2f( R s)^-s-i. 



Applying Corollary 1 we obtain the following algorithm to compute the asymptotic limits of the diagonal
elements of $\mathcal{R}^{\ell}$ and its eigenvalue moments.

Algorithm 1 



Initialization: Let $\rho_0(z) = 1$ and $\mu_0(y) = 1$.

$\ell$-th step: • Define $u_{\ell-1}(y) = r y \mu_{\ell-1}(y)$ and write it as a polynomial in $y$.

• Define $v_{\ell-1}(z) = z \rho_{\ell-1}(z)$ and write it as a polynomial in $z$.
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Define 

£. = — / T c mu)\ 2s duj (27) 

and replace all monomials $y, y^2, \ldots, y^{\ell}$ in the polynomial $u_{\ell-1}(y)$ by $E_1/T_c, E_2/T_c, \ldots,
E_\ell/T_c$, respectively. Denote the result by $U_{\ell-1}$.

• Define $m_{|A|^2}^{(s)} = E\{|a_k|^{2s}\}$ and replace all monomials $z, z^2, \ldots, z^{\ell}$ in the polynomial
$v_{\ell-1}(z)$ by the moments $m_{|A|^2}^{(1)}, m_{|A|^2}^{(2)}, \ldots, m_{|A|^2}^{(\ell)}$, respectively. Denote the result by $V_{\ell-1}$.

• Calculate

$$\rho_\ell(z) = \sum_{s=0}^{\ell-1} z\, U_{\ell-s-1}\, \rho_s(z)$$

$$\mu_\ell(y) = \frac{r}{T_c} \sum_{s=0}^{\ell-1} \beta y\, V_{\ell-s-1}\, \mu_s(y).$$

• Assign $\rho_\ell(\lambda)$ to $R_\ell(\lambda)$.

• Replace all monomials $z, z^2, \ldots, z^{\ell}$ in the polynomial $\rho_\ell(z)$ by the moments $m_{|A|^2}^{(1)},
m_{|A|^2}^{(2)}, \ldots, m_{|A|^2}^{(\ell)}$, respectively, and assign the result to $m_{\mathcal{R}}^{(\ell)}$.



Algorithm 1 is derived in Appendix III.
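
The polynomial recursion of Algorithm 1 can be sketched in Python as follows. This is an illustration under a simplifying assumption: the rate factor $r/T_c$ is normalized to one, so only the structure of the recursion is shown and any overall scaling by powers of $r/T_c$ is left out. The function name `asymptotic_moments` and the argument layout are hypothetical; the quantities $E_s$ and the amplitude moments are supplied as plain lists.

```python
import numpy as np

def asymptotic_moments(beta, mA, E, order):
    """Polynomial recursion of Algorithm 1 with the rate factor r/T_c set to 1.

    mA[s-1] = E{|a_k|^(2s)} and E[s-1] = E_s for s = 1..order.
    Returns (rho, moments): rho[l] holds the coefficients of rho_l(z)
    (index = power of z) and moments[l] the eigenvalue moment of order l.
    """
    rho = [np.array([1.0])]    # rho_0(z) = 1
    mu = [np.array([1.0])]     # mu_0(y) = 1
    U, V, moments = [], [], [1.0]
    for l in range(1, order + 1):
        # u_{l-1}(y) = y * mu_{l-1}(y): the monomial y^(s+1) is replaced by E_{s+1}
        U.append(sum(c * E[s] for s, c in enumerate(mu[l - 1])))
        # v_{l-1}(z) = z * rho_{l-1}(z): the monomial z^(s+1) is replaced by mA_{s+1}
        V.append(sum(c * mA[s] for s, c in enumerate(rho[l - 1])))
        new_rho = np.zeros(l + 1)
        new_mu = np.zeros(l + 1)
        for s in range(l):
            # multiplying by z (or y) shifts the coefficients up by one power
            new_rho[1:len(rho[s]) + 1] += U[l - s - 1] * rho[s]
            new_mu[1:len(mu[s]) + 1] += beta * V[l - s - 1] * mu[s]
        rho.append(new_rho)
        mu.append(new_mu)
        # replace z^s by mA_s in rho_l(z); rho_l has no constant term for l >= 1
        moments.append(sum(c * mA[s - 1] for s, c in enumerate(new_rho) if s >= 1))
    return rho, moments

beta = 0.5
rho, moments = asymptotic_moments(beta, mA=[1.0] * 5, E=[1.0] * 5, order=5)
```

With unit amplitudes and all $E_s$ equal to one, the sketch returns the moments $1$, $1 + \beta$, $1 + 3\beta + \beta^2$, and so on, which is a convenient sanity check. The limit $R_\ell(\lambda)$ for a given received power can be evaluated from the returned coefficients, for instance with `np.polynomial.polynomial.polyval(lam, rho[l])`.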

Interestingly, the recursive equations in Corollary 1 do not depend on the time delay of the signal of
user $k$, i.e. the performance of a CDMA system with multistage detection is independent of the sampling
instants and time delays if the assumptions of Corollary 1 on the chip waveforms and on the time delays are
satisfied.

Additionally, the dependence of $R_\ell(\lambda)$ on the chip pulse waveforms becomes clear from Algorithm 1:
$R_\ell(\lambda)$ depends on $\Phi(\omega)$ through the quantities $E_s$, $s = 1, 2, \ldots$, defined in (27).
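By applying Algorithm 1 we compute the first five asymptotic eigenvalue moments

$$m_{\mathcal{R}}^{(1)} = \frac{r}{T_c}\, m_{|A|^2}^{(1)} E_1$$

$$m_{\mathcal{R}}^{(2)} = \left(\frac{r}{T_c}\right)^2 \left[\beta \left(m_{|A|^2}^{(1)}\right)^2 E_2 + m_{|A|^2}^{(2)} E_1^2\right]$$

$$m_{\mathcal{R}}^{(3)} = \left(\frac{r}{T_c}\right)^3 \left[\beta^2 \left(m_{|A|^2}^{(1)}\right)^3 E_3 + 3 m_{|A|^2}^{(2)} \beta m_{|A|^2}^{(1)} E_1 E_2 + m_{|A|^2}^{(3)} E_1^3\right]$$

$$m_{\mathcal{R}}^{(4)} = \left(\frac{r}{T_c}\right)^4 \Big[2\beta^2 E_2^2 m_{|A|^2}^{(2)} \left(m_{|A|^2}^{(1)}\right)^2 + 4\beta E_1^2 E_2 m_{|A|^2}^{(3)} m_{|A|^2}^{(1)} + 4\beta^2 E_1 E_3 m_{|A|^2}^{(2)} \left(m_{|A|^2}^{(1)}\right)^2 + \beta^3 E_4 \left(m_{|A|^2}^{(1)}\right)^4 + 2\beta E_1^2 E_2 \left(m_{|A|^2}^{(2)}\right)^2 + E_1^4 m_{|A|^2}^{(4)}\Big]$$

$$m_{\mathcal{R}}^{(5)} = \left(\frac{r}{T_c}\right)^5 \Big[\beta^4 \left(m_{|A|^2}^{(1)}\right)^5 E_5 + E_1^5 m_{|A|^2}^{(5)} + 5\beta^3 E_1 E_4 m_{|A|^2}^{(2)} \left(m_{|A|^2}^{(1)}\right)^3 + 5\beta^3 E_2 E_3 m_{|A|^2}^{(2)} \left(m_{|A|^2}^{(1)}\right)^3 + 5\beta^2 E_1^2 E_3 m_{|A|^2}^{(3)} \left(m_{|A|^2}^{(1)}\right)^2 + 5\beta^2 E_1^2 E_3 \left(m_{|A|^2}^{(2)}\right)^2 m_{|A|^2}^{(1)} + 5\beta^2 E_1 E_2^2 \left(m_{|A|^2}^{(2)}\right)^2 m_{|A|^2}^{(1)} + 5\beta^2 E_1 E_2^2 m_{|A|^2}^{(3)} \left(m_{|A|^2}^{(1)}\right)^2 + 5\beta E_1^3 E_2 m_{|A|^2}^{(4)} m_{|A|^2}^{(1)} + 5\beta E_1^3 E_2 m_{|A|^2}^{(2)} m_{|A|^2}^{(3)}\Big].$$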

In general, the eigenvalue moments of $\mathcal{R}$ depend only on the system load $\beta$, the sampling rate $r/T_c$, the
eigenvalue distribution of the matrix $A^H A$, and $E_s$, $s \in \mathbb{Z}^+$. The latter coefficients take into account the
effects of the shape of the chip pulse or, equivalently, of the frequency spectrum of the function $\phi(t)$. The
asymptotic limits of the diagonal elements of the matrix $\mathcal{R}^{\ell}$ corresponding to user $k$ depend also on $|a_k|^2$
but not on the time delay $\tau_k$.

In the special case of chip pulse waveforms $\psi(t)$ having bandwidth not greater than half of the chip
rate, i.e. $B \le \frac{1}{2T_c}$, the result of Corollary 1 holds for any set of time delays, including synchronous systems.

In Theorem 2, chip pulse waveforms with bandwidth $B \le \frac{1}{2T_c}$ are considered and the diagonal elements
of $\mathcal{R}^{\ell}$ are shown to be independent of the time delays of the active users.



Theorem 2 Let the definitions of Theorem 1 hold.

We assume that the function $\Phi(\omega)$ is bounded in absolute value and has support $S \subseteq \left[-\frac{\pi}{T_c}, \frac{\pi}{T_c}\right]$. The
vectors $s_k$ are independent with i.i.d. Gaussian elements $s_{nk} \in \mathbb{C}$ such that $E\{s_{nk}\} = 0$ and $E\{|s_{nk}|^2\} =
\frac{1}{N}$. Furthermore, the elements of the matrix $A$ are uniformly bounded for any $K$. The sequence of the
empirical distributions $F^{(K)}_{|A|^2}(\lambda) = \frac{1}{K} \sum_{k=1}^{K} 1(\lambda - |a_k|^2)$ converges in law almost surely, as $K \to \infty$, to a
non-random distribution function $F_{|A|^2}(\lambda)$.

Then, given $|a_k|^2$, the $n$-th diagonal element of the matrix $\mathcal{R}^{\ell}$, with $n \bmod K = k$, converges almost
surely to a deterministic value, conditionally on $|a_k|^2$,

$$\lim_{K = \beta N \to \infty} (\mathcal{R}^{\ell})_{k,m} = \lim_{K = \beta N \to \infty} h_{k,m}^H \mathcal{T}^{\ell-1} h_{k,m} = R_\ell(|a_k|^2)$$

with $R_\ell(|a_k|^2)$ determined by the following recursion

$$R_\ell(\lambda) = \sum_{s=0}^{\ell-1} \lambda R_s(\lambda)\, \nu_{\ell-s-1} \tag{28}$$



and 



t-i 

T e (u) = ^Y,/3f(R t - 8 - 1 )-\$(u } )\ 2 T s (u) ueS (29) 



\[ f(R_s) = \int \lambda R_s(\lambda)\, dF_{|A|^2}(\lambda) \tag{30} \]



™ 2 



^ = 7T^ / |^H| 2 ^Hda;. (31) 



2ttT c J5 

r/ze recursion is initialized by setting Tq(u) = — and Rq(\) = 1. 

Theorem 2 is proven in Appendix IV. It applies to Front-end A but, in general, not to Front-end B since Front-end B implies the use of root Nyquist pulses. It is straightforward to verify that Algorithm 1 can be applied to determine $R_\ell(\lambda)$, the asymptotic limit of the diagonal elements, and the eigenvalue moments of matrices $\mathcal{R}$ satisfying the conditions of Theorem 2.
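The structure of the recursion can be illustrated numerically. The sketch below is a normalized variant, not the paper's Algorithm 1: it assumes $T_c = 1$, $r = 1$, the initialization $T_0(\omega) = 1$, and it replaces the normalized frequency integral by an average over a uniform grid, so that all constant prefactors are absorbed. The function name `recursion_moments` and the discrete power distribution are hypothetical.

```python
def recursion_moments(beta, powers, probs, spectrum, order):
    """Moments m^(1..order) from a normalized form of the recursion.

    powers, probs : discrete distribution of the received powers |a_k|^2
    spectrum      : samples of |Phi(omega)|^2 on a uniform grid over its support
                    (the grid average stands in for the normalized integral)
    """
    n_w = len(spectrum)
    R = [[1.0] * len(powers)]   # R_0(lambda) = 1
    T = [[1.0] * n_w]           # T_0(omega) = 1 (normalization assumed here)
    nu, f, moments = [], [], []
    for ell in range(1, order + 1):
        # nu_{ell-1}: grid average of |Phi|^2 * T_{ell-1}
        nu.append(sum(p * t for p, t in zip(spectrum, T[ell - 1])) / n_w)
        # f(R_{ell-1}): expectation of lambda * R_{ell-1}(lambda)
        f.append(sum(q * lam * r for q, lam, r in zip(probs, powers, R[ell - 1])))
        # R_ell(lambda) = sum_s lambda * R_s(lambda) * nu_{ell-s-1}
        R.append([lam * sum(R[s][j] * nu[ell - s - 1] for s in range(ell))
                  for j, lam in enumerate(powers)])
        # T_ell(omega) = sum_s beta * f(R_{ell-s-1}) * |Phi|^2 * T_s(omega)
        T.append([beta * p * sum(f[ell - s - 1] * T[s][i] for s in range(ell))
                  for i, p in enumerate(spectrum)])
        # m^(ell) is the expectation of R_ell(lambda) over the power distribution
        moments.append(sum(q * r for q, r in zip(probs, R[ell])))
    return moments


print(recursion_moments(0.5, [1.0, 3.0], [0.5, 0.5], [1.0, 0.5, 0.25, 0.25], 3))
```

Under these assumptions the output reproduces the bracketed parts of the closed-form moments, with $\mathcal{E}_s$ taken as the grid average of $|\Phi|^{2s}$.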

The mathematical results presented in this section have important implications on the design and analysis 
of asynchronous CDMA systems and linear detectors for asynchronous CDMA systems. We elaborate on 
them in the following section. 

VI. Effects of Asynchronism, Chip Pulse Waveforms, and Sets of Observables 

The theoretical framework developed in Section V enables the analysis and design of linear multistage
detectors for CDMA systems using optimum and suboptimum statistics and possibly nonideal chip pulse
waveforms. In this section we focus on the following aspects:

1) Analysis of the effects of chip pulse waveforms and time delay distributions when the multistage detec- 
tors are fed by sufficient statistics. 

2) Impact of the use of sufficient and suboptimum statistics on the complexity and the performance of 
multistage detectors. 
A. Sufficient Statistics

Sufficient statistics impaired by discrete additive Gaussian noise are obtained as output of front end Type A. For chip pulse waveforms with bandwidth $B \le \frac{1}{2T_c}$ and any set of time delays, Theorem 2 applies. For $B > \frac{1}{2T_c}$ and uniform time delay distribution, Corollary 1 holds. In both cases, as $K, N \to \infty$ with constant ratio $\beta$, the diagonal elements of the matrix $\mathcal{R}^\ell$ and the eigenvalue moments can be obtained from Algorithm 1. As a consequence of (18), the performance of the large class of multiuser detectors that admit a representation as multistage detectors depends only on the diagonal elements of $\mathcal{R}^\ell$ and the variance of the noise. In large CDMA systems, the SINR depends on the system load $\beta$, the sampling rate $\frac{r}{T_c}$, the limit distribution of the received powers $F_{|A|^2}(\lambda)$, the variance of the noise $\sigma^2$, the coefficients $\mathcal{E}_s$, $s \in \mathbb{Z}^+$, and the received powers $|a_k|^2$, but in general it is independent of the time delay $\tau_k$. For $B \le \frac{1}{2T_c}$, the SINR is also independent of the time delay distribution. Therefore we can state the following corollary.

Corollary 2 If the bandwidth of the chip pulse waveform satisfies the constraint $B \le \frac{1}{2T_c}$, large synchronous and asynchronous CDMA systems have the same performance in terms of SINR when a linear detector that admits a representation as multistage detector is used at the receiver.

If the time delays and the received amplitudes of the signals are known at the receiver and the sampling rate satisfies the conditions of the sampling theorem, synchronous and asynchronous CDMA systems have the same performance. The equivalence between synchronous and asynchronous CDMA systems using an ideal Nyquist sinc waveform ($B = \frac{1}{2T_c}$) and a linear MMSE detector is established in [24]. Corollary 2 generalizes that equivalence to any chip pulse waveform with bandwidth $B \le \frac{1}{2T_c}$ and any linear multiuser detector with a multistage representation.

By inspection of Algorithm 1 we can verify that the dependence of $R_\ell(|a_k|^2)$ and $m_{\mathcal{R}}^{(\ell)}$ on the sampling rate $\frac{r}{T_c}$ can be expressed by the following relations

\[ R_\ell(|a_k|^2) = \left(\frac{r}{T_c}\right)^{\ell} R^*_\ell(|a_k|^2) \tag{32} \]

and

\[ m_{\mathcal{R}}^{(\ell)} = \left(\frac{r}{T_c}\right)^{\ell} m_{\mathcal{R}}^{*(\ell)} \tag{33} \]

where $R^*_\ell(|a_k|^2)$ and $m_{\mathcal{R}}^{*(\ell)}$ are independent of the sampling rate $\frac{r}{T_c}$. Thanks to this particular dependence and the fact that $\sigma^2 = \frac{r}{T_c} N_0$, the quadratic forms appearing in (18) are independent of the sampling rate for large systems, when specialized to multistage Wiener filters and to polynomial expansion detectors. Thus, the large system performance of (i) linear multistage detectors optimum in a mean square sense (see (19)), (ii) the polynomial expansion detectors, and (iii) the matched filters is independent of the sampling rate. This property is not general. Detectors that are not designed to make the best use of the available sufficient statistics may improve their performance using different sets of sufficient statistics. Therefore, the large system performance of other multistage detectors like PIC detectors depends on the sampling rate and can possibly improve by increasing the oversampling factor $r$.
Given a positive real $\gamma$, let us consider the chip pulse

\[ \Phi(\omega) = \begin{cases} \sqrt{\dfrac{T_c}{\gamma}} & \text{for } |\omega| \le \dfrac{\pi\gamma}{T_c} \\ 0 & \text{otherwise} \end{cases} \tag{34} \]

corresponding to a sinc waveform with bandwidth $B = \frac{\gamma}{2T_c}$ and unit energy. For waveform (34) with $\gamma = 1$, $T_c = 1$, and $r = 1$, Algorithm 1 reduces to Algorithm 1 in [18] for synchronous systems. Let us denote by $R_\ell^{(\mathrm{syn})}(|a_k|^2, \beta)$ and $m_{\mathcal{R}}^{(\ell),(\mathrm{syn})}(\beta)$ the values of $R_\ell(|a_k|^2)$ and $m_{\mathcal{R}}^{(\ell)}$ for such a synchronous case and system load $\beta$. Then, in general, for chip pulse waveform (34) Algorithm 1 yields

\[ R_\ell^{(\mathrm{sinc})}(|a_k|^2) = \left(\frac{r}{T_c}\right)^{\ell} R_\ell^{(\mathrm{syn})}\!\left(|a_k|^2, \frac{\beta}{\gamma}\right) \tag{35} \]

and

\[ m_{\mathcal{R}}^{(\ell),(\mathrm{sinc})} = \left(\frac{r}{T_c}\right)^{\ell} m_{\mathcal{R}}^{(\ell),(\mathrm{syn})}\!\left(\frac{\beta}{\gamma}\right). \tag{36} \]

Therefore, the same property pointed out in part I of this paper [1] for linear MMSE detectors holds for several multistage detectors (namely, multistage Wiener filters, polynomial expansion detectors, matched filters): In a large asynchronous CDMA system using a sinc function with bandwidth $\frac{\gamma}{2T_c}$ as chip pulse waveform and system load $\beta$, any multistage detector whose performance is independent of the sampling rate performs as well as in a large synchronous CDMA system with modulation based on root Nyquist chip pulses and system load $\beta' = \frac{\beta}{\gamma}$.
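The load reduction can be checked on the second moment. The worked step below is a sketch under two stated assumptions: the normalization $\mathcal{E}_s = \frac{1}{2\pi T_c^{s-1}}\int |\Phi(\omega)|^{2s}\,d\omega$ (chosen here so that $\mathcal{E}_s = 1$ for the unit-energy sinc pulse with $\gamma = 1$), and a unit-energy sinc pulse with $|\Phi(\omega)|^2 = T_c/\gamma$ on $|\omega| \le \pi\gamma/T_c$.

```latex
\mathcal{E}_s
  = \frac{1}{2\pi T_c^{s-1}} \left(\frac{T_c}{\gamma}\right)^{s} \frac{2\pi\gamma}{T_c}
  = \gamma^{1-s},
\qquad
m_{\mathcal{R}}^{(2)}
  = \left(\frac{r}{T_c}\right)^{2}
    \left[ \beta \big(m_{|A|^2}^{(1)}\big)^2 \mathcal{E}_2 + m_{|A|^2}^{(2)} \mathcal{E}_1^2 \right]
  = \left(\frac{r}{T_c}\right)^{2}
    \left[ \frac{\beta}{\gamma} \big(m_{|A|^2}^{(1)}\big)^2 + m_{|A|^2}^{(2)} \right].
```

The bracket is the synchronous second moment evaluated at load $\beta/\gamma$. The same pairing of each power of $\beta$ with one power of $1/\gamma$ occurs in the higher moments, since every term of order $\beta^i$ carries coefficients $\mathcal{E}_s$ whose indices sum to $i$ plus the number of factors.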

The comparison of synchronous and asynchronous systems with equal chip pulse waveforms enables us to analyze the effects of the chip pulse waveforms on the system performance jointly with the effects of the distribution of time delays. We elaborate on these aspects focusing on root raised cosine chip pulse waveforms with roll-off $\theta \in [0, 1]$ and on chip pulse waveforms (34) with $\gamma \in [1, 2]$. To simplify the notation, we assume $T_c = 1$. Let



sin 



< |w| < 7T(1 - 0) 

7T(1 — 0) < |w| < 7T(l+0) 

|w| > 7T(1 +7?). 



The energy frequency spectrum of a root raised cosine waveform with unit energy is given by $|\Psi_{\mathrm{sqrc}}(\omega)|^2 = S(\omega)$. The large system analysis of an asynchronous CDMA system using root raised cosine chip pulse waveform is obtained applying Algorithm 1. The corresponding coefficients $\mathcal{E}_{\mathrm{sqrc},s}$, $s \in \mathbb{Z}^+$, are given by



1 /•7r(l+7) 

£ sqrM =2 s (l - 7) + - / sin s 



— vk-uj) Ida;. 
27 



It is well known that in a synchronous CDMA system the performance is maximized using root Nyquist waveforms. In this case the performance is independent of the specific waveform and the bandwidth. It equals the performance of a large synchronous system using the sinc function with bandwidth $\frac{1}{2T_c}$ as chip pulse. Since the root raised cosine pulses are root Nyquist waveforms, they attain the maximum SINR in synchronous systems. The large system performance of multistage Wiener filters for synchronous CDMA systems with a root raised cosine waveform is obtained making use of (19) and Algorithm 1 with $r = 1$ and $\mathcal{E}_s = 1$, $s \in \mathbb{Z}^+$.
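For the asynchronous case the pulse-shape coefficients of a root raised cosine pulse can be estimated numerically. This is a minimal sketch assuming $T_c = 1$, the standard raised cosine form for the energy spectrum, and $\mathcal{E}_s = \frac{1}{2\pi}\int S(\omega)^s\,d\omega$; the helper names `raised_cosine` and `pulse_coefficient` are hypothetical.

```python
import math


def raised_cosine(omega, theta):
    """Energy spectrum of a unit-energy root raised cosine pulse, T_c = 1 (assumed form)."""
    w = abs(omega)
    if w <= math.pi * (1 - theta):
        return 1.0
    if w < math.pi * (1 + theta):
        return 0.5 * (1 - math.sin((w - math.pi) / (2 * theta)))
    return 0.0


def pulse_coefficient(s, theta, steps=20000):
    """Midpoint-rule estimate of (1 / 2 pi) * integral of S(omega)^s over the support."""
    upper = math.pi * (1 + theta)
    h = 2 * upper / steps
    total = sum(raised_cosine(-upper + (i + 0.5) * h, theta) ** s
                for i in range(steps))
    return total * h / (2 * math.pi)


for theta in (0.0, 0.5, 1.0):
    print(theta, [round(pulse_coefficient(s, theta), 4) for s in (1, 2, 3)])
```

Under these assumptions $\mathcal{E}_1 = 1$ for every roll-off (unit energy), while the second coefficient evaluates in closed form to $1 - \theta/4$, so the higher-order coefficients shrink as the roll-off grows.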

In general, chip pulse waveform (34) is not a root Nyquist waveform. For this reason the performance analysis of linear multistage Wiener filters for synchronous CDMA systems [14], [18] is not applicable. In this case, characterized by interchip interference, we can still apply Theorem 1, sampling at rate $\frac{2}{T_c}$ and assuming a Dirac function $f_T(\tau) = \delta(\tau)$ as probability density function of the time delays. For the chip pulse waveform (34), the matrix $Q(\Omega) = \Delta_{\phi,2}(\Omega, 0)\Delta_{\phi,2}^H(\Omega, 0)$ used in the recursion of Theorem 1 is given by



Q(fi) 



/ 



\ 



1 



4 




\n\ < 2tt (1 - 1) 



2tt (1 - I) < |0| < 7T. 



The large system analysis in the asynchronous case with chip pulse (34) can be readily performed making
use of (HI and 03). 
In Figure 2 the large system SINR at the output of a multistage Wiener filter with $L = 4$ is plotted as a function of the bandwidth for synchronous and asynchronous CDMA systems based on modulation by root raised cosine or by pulse (34). We assume perfect power control, i.e. $A = I$, system load $\beta = 0.5$, and input SNR = 10 dB.

It is well known from the theory on synchronous CDMA that interchip interference colors the discrete-time spectrum of the signal and degrades performance. Consistently with this effect, Figure 2 shows that in synchronous CDMA root raised cosine pulses outperform sinc pulses with non-integer ratios of bandwidth to chip rate, since the former avoid interchip interference. Asynchronous CDMA systems with both chip pulse waveforms widely outperform the corresponding synchronous systems. In contrast to the synchronous case, sinc pulses exploit the additional degrees of freedom introduced by increasing the bandwidth better than root raised cosine pulses, since they do not color the spectrum in the continuous time domain. Thus, an asynchronous CDMA system with sinc pulses considerably outperforms a system using root raised cosine pulses. Note that for asynchronous systems, the spectral shape in continuous time is relevant, while for synchronous systems the spectral shape in discrete time matters. In both cases the spectrum should be as white as possible to achieve high performance. For asynchronous systems, the spectrum is less colored the more closely the delay distribution resembles a (possibly discrete) uniform distribution.

In Figure 3 the SINR at the output of a multistage Wiener filter with $L = 8$ is plotted as a function of the system load, parametric in the bandwidth, for SNR = 10 dB. The improvement achievable by asynchronous systems over synchronous systems increases as the system load increases.

B. Chip Rate Sampling 

Chip rate sampling is a widely used approach to generate statistics for asynchronous CDMA systems. It 
implies the use of root Nyquist chip pulses and makes use of front end Type B. Hereafter, we refer to these 
CDMA systems as systems B, while we refer to the systems that use sufficient statistics from a front end 
Type A as systems A. 

A bound on the performance of systems B with linear MMSE detectors is given in [25]. The performance analysis of linear multistage detectors as $K, N \to \infty$ with $\frac{K}{N} \to \beta$ can be performed applying Theorem 1 to the chip pulse waveform at the output of the chip matched filter, $\Phi(\omega) = |\Psi(\omega)|^2$, and assuming $r = 1$.



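In order to elaborate further on systems B we focus on the root-raised cosine chip pulse with roll-off $\theta$ [26]

\[ \psi(t) = \frac{4\theta \frac{t}{T_c} \cos\left(\pi(1+\theta)\frac{t}{T_c}\right) + \sin\left(\pi(1-\theta)\frac{t}{T_c}\right)}{\pi \frac{t}{T_c}\left(1 - \left(4\theta\frac{t}{T_c}\right)^2\right)}, \qquad \theta \in [0, 1]. \]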
Fig. 2. Output SINR of a multistage Wiener filter with L = 4 versus bandwidth. CDMA systems with equal received powers, root raised cosine chip waveforms or sinc pulses, system load β = 1/2 and input SNR = 10 dB are considered. (Panel title: AWGN channel, β = 0.5, SNR = 10 dB, L = 4.)

Fig. 3. Output SINR of a multistage Wiener filter with L = 8 versus the system load. Asynchronous CDMA systems with equal received powers, root raised cosine chip waveforms or sinc pulses with bandwidth B = 1.5, 2 Hz, input SNR = 10 dB are compared to synchronous CDMA systems with root Nyquist chip pulses. (Panel title: AWGN channel, SNR = 10 dB, M = 8, chip rate = 1 Hz. Horizontal axis: system load β. Curves: synchronous root Nyquist pulses; asynchronous sinc pulses with bandwidth 1.5 Hz and 2 Hz; asynchronous root raised cosine pulses with bandwidth 1.5 Hz and 2 Hz.)



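In this case, the matrix function $Q(\Omega, \tau) = \Delta_{\phi,1}(\Omega, \tau)\Delta_{\phi,1}^H(\Omega, \tau)$ occurring in Theorem 1 reduces to the scalar function

\[ Q(\Omega, \tau) = \begin{cases}
\frac{1}{2} + \frac{1}{2}\sin^2\left(\frac{1}{2\theta}(\Omega + \pi)\right) + \frac{1}{2}\cos\left(\frac{2\pi\tau}{T_c}\right)\left(1 - \sin^2\left(\frac{1}{2\theta}(\Omega + \pi)\right)\right) & -\pi \le \Omega \le -\pi(1-\theta) \\
1 & -\pi(1-\theta) \le \Omega \le \pi(1-\theta) \\
\frac{1}{2} + \frac{1}{2}\sin^2\left(\frac{1}{2\theta}(\Omega - \pi)\right) + \frac{1}{2}\cos\left(\frac{2\pi\tau}{T_c}\right)\left(1 - \sin^2\left(\frac{1}{2\theta}(\Omega - \pi)\right)\right) & \pi(1-\theta) \le \Omega \le \pi
\end{cases} \]

due to the fact that $r = 1$. Equal received powers, system load $\beta = \frac{1}{2}$, and multistage Wiener filters with $L = 3$ define the scenario we consider for the asymptotic analysis.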

The analysis shows a strong dependence of the performance on the time delays. As expected, it is possible 
to verify that the best SINR is obtained when the sampling instants coincide with the time delays of the user 
of interest. 
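The dependence on the sampling instant can be reproduced with a short numerical sketch. It assumes $T_c = 1$, a raised cosine spectrum at the output of the chip matched filter, and that a delay $\tau$ enters each spectral alias as a phase factor $e^{j\omega\tau}$; the function names `raised_cosine` and `chip_rate_q` are hypothetical and are not taken from the paper.

```python
import cmath
import math


def raised_cosine(omega, theta):
    """Raised cosine spectrum at the chip matched filter output, T_c = 1 (assumed form)."""
    w = abs(omega)
    if w <= math.pi * (1 - theta):
        return 1.0
    if w < math.pi * (1 + theta):
        return 0.5 * (1 - math.sin((w - math.pi) / (2 * theta)))
    return 0.0


def chip_rate_q(Omega, tau, theta):
    """Squared magnitude of the aliased, delayed spectrum seen after chip-rate sampling."""
    total = 0j
    # For theta <= 1 and Omega in [-pi, pi] only these three aliases can overlap.
    for m in (-1, 0, 1):
        w = Omega + 2 * math.pi * m
        total += cmath.exp(1j * w * tau) * raised_cosine(w, theta)
    return abs(total) ** 2


for tau in (0.0, 0.25, 0.5):
    print(tau, [round(chip_rate_q(x * math.pi, tau, 0.5), 4) for x in (0.0, 0.6, 0.8, 1.0)])
```

With $\tau = 0$ the aliases add coherently and the discrete-time spectrum is flat (the Nyquist property). For a half-chip offset the aliases cancel at the band edge $\Omega = \pi$, which colors the spectrum and is consistent with the SINR loss described above.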

In Figure 4 we compare the performance of system B with root raised cosine chip pulse to the SINR of a system A with the same modulating pulse. In the comparison we consider the best SINR for system B, obtained when the sampling times coincide with the time delays of the user of interest. The curves represent the output SINR as a function of the roll-off $\theta$, parameterized with respect to SNR. The parameter (SNR) varies from 0 dB to 20 dB in steps of 5 dB. As reference we also plot the performance of synchronous CDMA systems. As expected, multistage detectors with front-end A outperform the corresponding multistage detectors with front-end B.

Fig. 4. Asymptotic output SINR of a multistage Wiener filter with L = 3 versus the roll-off θ as front-end A (dashed lines) and front-end B (dots) are in use in an asynchronous CDMA system. The solid lines show the reference performance in synchronous CDMA systems. The curves are parametric in the input SNR with SNR varying between 0 dB and 20 dB in steps of 5 dB. (Panel title: root raised cosine pulse, system load β = 0.5, L = 3. Horizontal axis: roll-off.)

Interestingly, while linear multistage detectors and asynchronism in system A can compensate to some extent for the loss in spectral efficiency caused by the increasing roll-off and typical of synchronous CDMA systems, such a compensation is not possible in systems B. Systems B behave similarly to synchronous CDMA systems. In fact, the SINR for system B is very close to the performance of synchronous systems for any SNR level.

A thorough explanation of these properties based on general analytical results is in Part I, Section V [1]. We recapitulate the main idea briefly here. The performance of a large asynchronous CDMA system is governed by an $r \times r$ matrix function in the frequency domain (eq. (24) in [1])$^4$. To give an intuition, the system is then equivalent to a MIMO system with $r$ transmit and $r$ receive antennas. The structure of this matrix is such that the matrix is necessarily rank one for synchronous CDMA systems. Thus, only one dimension of the signal space is spanned. On the contrary, for arbitrary delay distributions, i.e. in general for asynchronous systems, the rank of the MIMO system can be higher, possibly up to $r$. This implies that asynchronous systems span more of the available dimensions of the signal space, resulting in better exploitation of it. When the received signal is sampled at the chip rate, as in the case of Front-end B, and $r = 1$, the processed signal for an asynchronous system only spans a single dimension, just like in synchronous systems, and the performances of synchronous and asynchronous systems are very similar.

Since the SINR in system B heavily depends on the sampling instants with respect to $\tau_k$, different statistics are needed for the detection of different users in order to obtain good performance. As a consequence, joint detection is not feasible and each user has to be detected independently. This is a significant drawback when several or all users have to be detected (e.g. uplink) and has a relevant impact on the complexity of the system. For example, the complexity order per bit of a multistage Wiener filter or polynomial expansion detector is linear in $rK$ in system A, while the complexity order per bit of the same detectors is quadratic in $K$ in system B. A similar increase in complexity can be noticed also for other detectors (e.g. linear MMSE detectors, or any multistage detector).

VII. Conclusions 

In Part II of this work we provided guidelines for the design of asynchronous CDMA systems via the anal- 
ysis of the effects of chip pulse waveforms, time delay distributions, sufficient and suboptimum observables 
on the complexity and performance of the broad class of multiuser detectors with multistage representation. 

Similarly to the results obtained in part I of this article [1], i.e. the chip-pulse constrained spectral efficiency and the performance of linear MMSE detectors, multistage detectors show performance independent of the time delays of the active users if the bandwidth of the chip pulse waveform is not greater than half of the chip rate, i.e. $B \le \frac{1}{2T_c}$. Above that threshold the performance of linear multistage detectors depends on the time delay distribution, and asynchronous CDMA systems outperform synchronous CDMA systems.

The framework presented here enabled the analysis of optimum and suboptimum multistage detectors based on front ends whose sampled noise outputs are white. We focused on multistage detectors using statistics A, which are sufficient, or observables B, which are suboptimum. In the two cases of (i) chip pulses with bandwidth $B \le \frac{1}{2T_c}$ and (ii) chip pulses with bandwidth $B > \frac{1}{2T_c}$, sufficient statistics, and uniform delay distribution, the effects of the chip pulse waveforms on the detector performance are described by the coefficients $\mathcal{E}_s = \frac{1}{2\pi T_c^{s-1}} \int |\Phi(\omega)|^{2s}\, d\omega$. The output SINR of linear MMSE detectors, multistage Wiener filters, polynomial expansion detectors, and matched filters is independent of the sampling rate. In contrast, the output SINR of other multistage detectors like PIC detectors depends on the sampling rate and increases with it.

4 Note that the matrices $T_\ell(\Omega)$ in Theorem 1 can be interpreted as expansion coefficients of this matrix.

Comparing the performance of synchronous and asynchronous CDMA systems with modulation based on root Nyquist pulses, namely root raised cosine waveforms, and modulation based on sinc functions with increasing bandwidth, it becomes apparent that the chip pulse design for synchronous CDMA systems follows the same guidelines as the chip pulse design for single user systems. In contrast, chip pulse design for asynchronous CDMA systems is governed by entirely different rules. For example, we found that CDMA systems with uniform delay distributions perform well if the spectrum of the received signal is as white as possible.

The asymptotic analysis of asynchronous CDMA systems using statistics B shows that the performance of multistage Wiener filters is close to the SINR of the corresponding synchronous CDMA systems for any bandwidth and level of SNR. Therefore, this kind of front-end is not capable of exploiting the benefits of asynchronous CDMA.

The universal weights proposed for the design of low complexity detectors account for the effects of asyn- 
chronism, sub-optimality of the statistics, and non-ideality of pulse-shapers. They depend on the sampling 
rate although the large system performance of some multistage detectors, namely multistage Wiener filters, 
polynomial expansion detectors, and matched filters, does not. 

From the asymptotic analysis and design performed in this work we can draw the following conclu- 
sions: 

• Multistage detectors with front end Type B and universal weights are asymptotically suboptimal and have the same complexity order per bit $O(K^2)$ in uplink as the linear MMSE detector.

• Multistage Wiener filters and polynomial expansion detectors with statistics A and universal weights are asymptotically optimum and have the same complexity order per bit as the matched filter, i.e. $O(rK)$ with $r \ll K$.

• If only one user has to be detected, multistage detectors using statistics B have slightly lower complexity than multistage detectors with statistics A, namely they have a complexity per bit $O(K^2)$ while in the latter case the complexity per bit is $O(rK^2)$. However, they perform almost as the multistage detectors for synchronous systems at any SNR and do not provide the gain in performance due to asynchronism, in contrast to statistics A.
Appendix I
Proof of Theorem 1

Before going into the details of the proof we introduce some properties of the convergence in probability and the almost sure convergence, or convergence with probability one.

Property A: Let us consider a finite number $q$ of random sequences $\{a_n^{(1)}\}, \ldots, \{a_n^{(q)}\}$ that converge in probability to deterministic limits $a_1, \ldots, a_q$, respectively. Then, any linear combination of such sequences converges in probability to the linear combination of the limits. Furthermore, if $|a_n^{(s)} - a_s| \to o(N^{-t_s})$, with $t_s \in \mathbb{R}^+$ and $s = 1, \ldots, q$, then any linear combination of the random sequences converges as $o(N^{-\min_{s=1,\ldots,q}(t_s)})$, at worst.

Property B: Let $\{a_n\}$ and $\{b_n\}$ be two random sequences that converge in probability to $a$ and $b$, respectively. Then, the sequence $\{a_n b_n\}$ converges in probability to $ab$.

Property C: If for large $n$, $\Pr\{|a_n - a| > \varepsilon\} \le o(n^{-s})$ and $\Pr\{|b_n - b| > \varepsilon\} \le o(n^{-t})$, with $s, t \in \mathbb{R}^+$, then also $\Pr\{|(a_n - a)(b_n - b)| > \varepsilon\} \le o(n^{-\min(s,t)})$, at worst.

The convergence with probability one, or almost sure convergence, implies the convergence in probability. In general, the converse is not true. However, if a random sequence converges in probability to a constant $a$ with a convergence rate $o(n^{-s})$ and $s > 1$, i.e. $\Pr\{|a_n - a| > \varepsilon\} \le o(n^{-s})$, then the convergence with probability one also holds. This is a straightforward consequence of the Borel–Cantelli lemma (see e.g. [27]).
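The Borel–Cantelli step can be spelled out as follows. This is a short sketch in which the constant $c$ and the index $n_0$ are introduced only for illustration: a rate $o(n^{-s})$ means that the probabilities are eventually bounded by $c\,n^{-s}$.

```latex
\sum_{n \ge n_0} \Pr\{|a_n - a| > \varepsilon\}
  \;\le\; \sum_{n \ge n_0} c\, n^{-s} \;<\; \infty
  \quad \text{for } s > 1
\;\;\Longrightarrow\;\;
\Pr\{|a_n - a| > \varepsilon \ \text{infinitely often}\} = 0 .
```

Since this holds for every $\varepsilon > 0$, the sequence converges to $a$ with probability one. For $s \le 1$ the series may diverge and the argument does not apply, which is why the proof tracks the rate $o(N^{-2})$ throughout.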

In part I Theorem 3 of this work [[TJ| we have shown that, when K,N —> +00 with constant ratio j3, 
the eigenvalue distribution of the infinite matrix 1Z is the same as the eigenvalue distribution of the matrix 
R = A H S SA = H H where S = (<E>iSi, $2^2, • • • &kSr) and is the r-block-wise circulant 
matrix of order iV defined in (flOl) with = mod T c . 
Let us consider the block diagonal matrix $\Delta_{\phi,r}(\tau_k)$ with $r \times 1$ blocks



and introduce the matrices 



/ 



(38) 



S = (A^ r (ri)si, A^ r (r 2 )s 2 , . . . A^r^s^) 



(39) 



and # = A H S H SA. 

By applying the same approach as in part I Theorem 1 of this work [OQ it can be shown that the eigenvalue 
distribution of the matrices R and R coincide. Then, also the eigenvalue moments of the two matrices 
coincide. The same property holds for the diagonal elements of the matrices R and R with i G Z . 

In the following we focus on the asymptotic analysis of the diagonal elements of the matrices R . 

Throughout this proof we adopt the following notation. For k = 1, . . . , K and n — 1, . . . , N 

• h k is the k th column of the matrix H; 

• h nk is the n th r x 1 block of the vector h k and h nk = a k {^ r (j k )) nn s nk ; 

• S n is the n th block row of H of dimensions r x K; 

• H\= n is the matrix obtained from H by suppressing 8 n ; 

• H^ k is the matrix obtained from H by suppressing h k , 
. T = HH H and TU = H^ k H* k ; 

• R\=n — H^ n H\= n ; 

• o* n = (s n i, s n 2, . . . , s n i^); 



. V- t , for t 



1, . . . , r and n — 1, . . . ,N, is a K x K diagonal matrix with the k th element equal to 
' . Note that er n V ni4 A coincides with the (t + (n — l)r) th row of the matrix H. 



• Tr nn i is the n diagonal block of T of dimensions r x r. 
Furthermore, since the channel gains $a_k$ are bounded, we denote by $\alpha_{MAX}$ their upper bound, i.e. $|a_k| \le \alpha_{MAX}$, $\forall k$. Finally, thanks to the assumption that $\Phi(\omega)$ is bounded in absolute value with finite support, also $|\phi(\Omega, \tau)|$ is upper bounded for any $\Omega$ and $\tau$. We denote by $\Phi_{MAX}$ its bound.

Let us observe first that the eigenvalue moments of the matrix $R$ (or equivalently of $T$) are almost surely upper bounded by finite positive values $C^{(s)}$, i.e.

3C (s) <+oo: Pr j-^R < = 1 as K, N -» +00, ^ -> /3. 



(40) 



In fact, 



V tljR = N ^ ^ h n u k 1 h >-nM h n2,k 2 hn 2M--- h n s ,k s h n s M 

k 1 ,...k s =l n\,...n s =l 

i K N 

= M S \ a ki\ 2 ■ ■ ■ \ a k a \ 2 ^2 A ^A^l)n 1 n 1 A ^r(r 2 ) ni n 1 ---^A^s)Zn s A <t>A^] 
k\,...k s =l ni,...n s =l 

X S n 1 ,k 1 S n 1 ,k 2 S n 2 ,k 2 S n 2 ,k 3 ■ ■ ■ S n s ,k s S n s ,k 1 



Applying the approach of non-crossing partitions [12811 . 11291 , it is possible to recognize that the factors 
s ni,fc 1 s ni,fc2 s n 2 ,fc 2 'V2,fc3 ■ ■ ■ S n s ,k s s n a ,k 1 which do not vanish asymptotically, correspond to the ones having 
nonzero non-crossing partitions. Correspondingly, also the remaining factors 

A <pA^)n ini ^.r^Jmn, • • • ^A^s)n a n a A <t>A^)n a n a 

are positive and bounded by 

|A^ r (Ti) nini A^ r (T 2 ) mm • • • A^ r (7^) nsns A^ r (fi) nsns | < — — . 

Therefore, 

j\T - f2s S ni,fci S ni,fc2 S n 2 ,fc2 Sn 2.' : 3 ■ ■ ■ S n a ,k a S n s ,k 1 I • 

c \ ki,...k s =ln 1 ,...n a =l / 



The last factor in (41) is the $s$-th eigenvalue moment of a central Wishart matrix with zero-mean i.i.d. Gaussian entries having variance $\frac{1}{N}$. Well established results of random matrix theory [30], [29], [12] show that the eigenvalue moments of such a matrix converge almost surely to finite values. More specifically,



at ^ 1 S n 1 ,k 1 S rii,k 2 S n 2 .k 2 S n 2 ,k3 ■ ■ ■ S n s ,k s S n a ,ki ~^ ^ ] J j J J ■ (42) 

ni,...n s =l i=0 \ i / \ 2 + 1 



Then, appealing to (41) and (42), the eigenvalue moments of the matrices $R$ and $T$ are upper bounded almost
surely by

c(s) = ^ A MAX QMAX g / M / « W (43) 

Jc i=o \ i / \ i + 1 / s 
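The limiting Wishart moments used in this bound are easy to tabulate. The sketch below assumes the normalization $\frac{1}{K}\mathrm{tr}\,(S^H S)^s$ for an $N \times K$ matrix $S$ with i.i.d. zero-mean entries of variance $\frac{1}{N}$ and $\beta = K/N$, for which the limit is the well-known sum $\sum_{i=0}^{s-1} \binom{s}{i}\binom{s}{i+1}\frac{\beta^i}{s}$; the function name `wishart_moment` is hypothetical.

```python
from math import comb


def wishart_moment(s, beta):
    """Limit of (1/K) tr (S^H S)^s for load beta = K/N (normalization assumed above)."""
    return sum(comb(s, i) * comb(s, i + 1) * beta ** i for i in range(s)) / s


for s in range(1, 6):
    print(s, wishart_moment(s, 0.5))
```

For every fixed $s$ and finite $\beta$ the sum is finite, which is all the argument needs: together with the bounds $\alpha_{MAX}$ and $\Phi_{MAX}$ it gives a finite constant $C^{(s)}$.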
The proof of Theorem 1 is based on strong induction. In the first step we prove the following facts:

1) The diagonal elements of the matrix $R$ converge almost surely, as $N \to \infty$, to deterministic values $R_1(|a_k|^2, \tau_k)$, conditionally on $(|a_k|^2, \tau_k)$. Furthermore, $\forall \varepsilon > 0$ and large $K = \beta N$

\[ \Pr\{|R_{kk} - R_1(|a_k|^2, \tau_k)| > \varepsilon\} \le o(N^{-2}). \]

2) $T_{[nn]}$, the $r \times r$ block diagonal elements of the matrix $T = HH^H$, converge almost surely to deterministic blocks $T_1(\Omega)$, with $\Omega = \lim_{N \to \infty} 2\pi\frac{n}{N}$. Additionally, $\forall \varepsilon > 0$, large $K = \beta N$ and $u, v = 1, \ldots, r$,

\[ \Pr\{|(T_{[nn]})_{uv} - (T_1(\Omega))_{uv}| > \varepsilon\} \le o(N^{-2}). \]

Then, in the recursion step, we use the following induction assumptions:

1) For $s = 1, \ldots, \ell - 1$, the diagonal elements of the matrix $R^s$ converge almost surely, as $K = \beta N \to \infty$, to deterministic values $R_s(|a_k|^2, \tau_k)$, conditionally on $(|a_k|^2, \tau_k)$. Additionally, $\forall \varepsilon > 0$ and large $K = \beta N$, $\Pr\{|(R^s)_{kk} - R_s(|a_k|^2, \tau_k)| > \varepsilon\} \le o(N^{-2})$.

2) For $s = 1, \ldots, \ell - 1$, $T^s_{[nn]}$, the $r \times r$ block diagonal elements of the matrix $T^s$, converge almost surely to deterministic blocks $T_s(\Omega)$, with$^5$ $\Omega = \lim_{N \to \infty} 2\pi\frac{n}{N}$. Additionally, $\forall \varepsilon > 0$, large $K = \beta N$, and $u, v = 1, \ldots, r$, $\Pr\{|(T^s_{[nn]})_{uv} - (T_s(\Omega))_{uv}| > \varepsilon\} \le o(N^{-2})$.

We prove:

1) The diagonal elements of the matrix $R^\ell$ converge almost surely, as $K = \beta N \to \infty$, to deterministic values $R_\ell(|a_k|^2, \tau_k)$, conditionally on $(|a_k|^2, \tau_k)$. Furthermore, $\forall \varepsilon > 0$ and large $K = \beta N$

\[ \Pr\{|(R^\ell)_{kk} - R_\ell(|a_k|^2, \tau_k)| > \varepsilon\} \le o(N^{-2}). \tag{44} \]

2) The blocks $T^\ell_{[nn]}$ converge almost surely to deterministic blocks $T_\ell(\Omega)$ with $\Omega = \lim_{N \to \infty} 2\pi\frac{n}{N}$. Additionally, $\forall \varepsilon > 0$, large $N$ and $u, v = 1, \ldots, r$,

\[ \Pr\{|(T^\ell_{[nn]})_{uv} - (T_\ell(\Omega))_{uv}| > \varepsilon\} \le o(N^{-2}). \tag{45} \]

First step: Consider R kk = h k h k = \a k \ 2 s^ A.^ r (T k )A < j }ir (T k )s k . Thanks to the bound r)| < 
^max which holds for any f2 and r, also the eigenvalues of the matrix A^ r (r)A^ r (f) are upper bounded. 



In fact, they are given by Y7t=i 4> {^ 7rl! jr^ T k 



.n-l ~ (t-l)T c N 



for n = 1, . . . , N. Therefore, the limit eigenvalue 



distribution of the matrix $\Delta_{\phi,r}^H(\tau)\Delta_{\phi,r}(\tau)$ has upper bounded support $\Lambda_{MAX}$. Then, by appealing to Lemma 9 in part I [1] with $p = 4$ and by making use of the bound $(\mathrm{tr}\, C)^2 \le N\, \mathrm{tr}(C^2)$, valid for any Hermitian matrix $C \in \mathbb{C}^{N \times N}$, we obtain

5 Note that $n = n(N)$ is also a function of the matrix size $N$.

12 



Ci = E 



W\ 2 s* A^ r (r k )A^(r k )s k - ^tr(Aj r (r fe )A , r (r fc )) 



< 



^|^tr(Aj r (r fc )A,, r (r fe )) 4 



if 4 K| 4 A4 

- jy2 ^MAX- 



Since $|a_k| \le \alpha_{MAX} < +\infty$, the Bienaymé inequality yields, $\forall \varepsilon > 0$,



Pr 



iifcfc - ^tr(Aj r (r fc )A^(r fc )) 



N 



E 



< 



- ^tr(Aj r (r fc )A^(r fc )) 



^ 4 |a fc | 4 A 4 



MAX 



Thanks to the bound (Bgb Ve: > 



Pr{|B fcfc -i2i(|a fc | 2 ,75t)| <o(N~ 2 ). 



(46) 



Furthermore, appealing to the Borel–Cantelli lemma (see e.g. [27]), this bound implies the following
almost sure convergence.



Ri(\,T)\ {x , T )=(\a k \2,T k ) = lim Rkk 

K =pJ\ — >oo 



lim 



\ak\ 



tr(A£ r (T fc )A , r (r fc )) 



K=f3N^oo N 



a r 27T 

/ A ?r(^7-)A ,,(x,r)dx 
^ Jo 



Let us now consider the block matrix $T_{[nn]}$ whose $(u, v)$ element $(T_{[nn]})_{uv}$ is given by



(T"[nn])nD = ^ n AV n ,u^n,v A H & H 

Thanks to the assumption of Theorem Q] that the support of F\ A \2 T (X, r) is bounded and r) is bounded 
in absolute value, the diagonal elements of the diagonal matrix AW niU V^ v A H are upper bounded in absolute 



(47) 
value by a positive constant $T_{MAX}$. Then, by appealing to Lemma 9 in part I [1] we obtain



E 



< ^±tr(AV n , u V» v A 



(48) 



By appealing again to the Bienaymé inequality and by making use of the bound (48) we obtain, $\forall \varepsilon > 0$,



Pr 



.T[ nn \)u tV tr(AVn jU V n V A t 



N 



> e> < — E 



tv(AV, IJI V , ,! J .A" 



< 



s 1 

K T 4 
- n -4- t MAX 

£ 4 iV 2 ' 



iV 



(49) 



Thus, the following convergence in probability holds 



lim (Tr. 

K=f3N ->oo 1 



K=/3N->oo N 

lim — 

K=0N-*x> K 



lim ^trAV n , u V^A H 



k=l 



N 



(3 I Xcf>[n,T-—T c 



r 

v-l 



n—1 ~ v—1 

Z7T , Tb T 

N ' k r L 



r 



tt,r T c dFi A |2 jT (A,r 



(50) 



with $\Omega = \lim_{N \to \infty} 2\pi\frac{n}{N}$ and $0 \le \Omega \le 2\pi$. Therefore, the block matrix $T_{[nn]}$ converges in probability and in
mean square sense to the $r \times r$ matrix



Ti(fi) = lim T [nn] 



P J AA , r (fi,r)Aj r (fi,r)d J F| A | 2 , T (A ! r) 



with $0 \le \Omega \le 2\pi$. Thanks to the bound (48), for large $K = \beta N$ and $\forall \varepsilon > 0$ the bound



Pr 



T(Q)) U , V <e\< o(N~ 2 ) 



holds. Making use of this bound and applying the Borel–Cantelli lemma, the almost sure convergence is also
proven. This concludes the proof of the first step.

Step $\ell$:

By appealing to the induction assumptions, i.e. the almost sure convergence of the diagonal elements of
$R^s$ and of the diagonal $r \times r$ blocks of $T^s$, for $s = 1, \ldots, \ell - 1$, we prove that the following almost sure
tr AVn )U -R|= n V? w A



lim 

K=(3N^oo 



N 



32 



v^k*-! 2 / n — u— 1 \ „/ n — 1 _ f— 1 



r 







J A0 (n, 



u — 1 
r T r 



ft, r - - — -T c ) i? s (A, r)cLF| A |2 iT (A, r) 



(51) 



with $\Omega = \lim_{N\to\infty} 2\pi \frac{n}{N}$, $s = 1, \ldots, \ell-1$, and



R s (\,T)\ {XiT)=ilakl 2^ k) = lim (R ) kk + o(iV 2 ) 

K =pN — »oo 



(52) 



as from the recursion assumptions. Furthermore, we prove the following almost sure convergence 



A ^ 



, lim ^trA^T^A^fo) = lim M- ^(Aj r ft)) m (T ) m (A, r (?,)) 

K=pN— >00 iV K=HN— >rvi /V ' » 



K=/3N^oo N 



n=l 



2tt 



Aj r (fi,r)T s (fi)A^(fi,r)dl] 



(A,r)=(|a,P,r fc ) 



with s = 1, 



1 and 



T s (ft) = lim (T )„„. 



In fact, for (51) we can write



(53) 



(54) 



C 2 = Pr 



-txAV n , u % n Vl v A H 



1 1 2 fr n — l^. u—\ \ f v — 1 \ i2~\ 



> e 



< C2a + C: 



26 



where 
and 

U = Pr 
Note that 



(2a = Pr 



s ^ s 



tTAV n , u (R" - K n )V»A H 



N 



> 



k=l \ 



r c ((-R )kk ~ R s {\a k \ 2 ,T k ) 



(2a < Pr 



1 , da s 
-tr(R - # 



> 



2/3 
H-



The expansion of the matrix $R^s = (R_{\sim n} + \delta_n^H \delta_n)^s$ yields



tvR = trR 



s-l 



(to,!l,..i s _l) 



ll, • • Yl fin R ^n$r 

u=0 



io+Ej=l(i+ 1 ) i j= s o 



where tp(io, ii, . . . i a -i) < 2 s is the number of the terms of the expansion of R whose trace equals 

n:= 



S n R^Jn • Then, 



(io,n,---*s-i) 

«o+Ei=l(i+ 1 )*j= s o 



/^MAX^MAX^ 



s+l 



Thanks to Property B on the convergence in probability, $\zeta_{2a}$ converges in probability with rate $o(N^{-2})$ at worst, i.e. $\forall \varepsilon > 0$,



In fact, for e' 



Pr 



lim Pr 

K=f3N^oo 



/32 s + 1 "max^max 



N 



> 



< o 



MAXVMAX 



N 2 



(55) 



H 



N 



I 5-1 



u=0 

<E Pr 

u=0 

M=0 



N 



A 



(c) 

< 



(56) 



where inequality (a) holds for $N$ sufficiently large, inequality (b) follows from the Bienaymé inequality, and inequality (c) is a consequence of Lemma 9 in part I [1] and the bound on the eigenvalue moments of the matrix $R$.

Let us now consider the probability $\zeta_{2b}$,



l k=l 



< Pr <{ max|(# ) kk - R s (\a k \ 2 , r k )\ > 



''MAXVMAX 



/^MAX^MAX 



(57) 
for $s = 1, \ldots, \ell-1$. Thanks to the assumption of the recursive step that, $\forall \varepsilon' > 0$ and large $K = \beta N$, $\Pr\{|(R^s)_{kk} - R_s(|a_k|^2, \tau_k)| > \varepsilon'\} \le o(N^{-2})$, we have $\zeta_{2b} \to o(N^{-2})$, i.e. it vanishes asymptotically as $N, K \to \infty$ with constant ratio, with the same convergence rate as $o(N^{-2})$ at worst. Therefore, (51) converges in probability with a rate $o(N^{-2})$ for $N \to +\infty$, at worst. This convergence rate enables the application of the Borel-Cantelli lemma to prove that (51) converges almost surely.

The proof of the convergence (53) with probability one follows along similar lines.
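The role of the $N^{-2}$ rate in the Borel-Cantelli step can be spelled out in one line. The following is a sketch under the assumption that the tail bound takes the form $C N^{-2}$ for some constant $C$; the symbols $X_N$ (the random quantity), $x$ (its deterministic limit) and $C$ are introduced here for illustration only.

```latex
\sum_{N=1}^{\infty} \Pr\{ |X_N - x| > \varepsilon \}
  \;\le\; C \sum_{N=1}^{\infty} N^{-2}
  \;=\; C\,\frac{\pi^2}{6} \;<\; \infty
\quad\Longrightarrow\quad
\Pr\{ |X_N - x| > \varepsilon \ \text{infinitely often} \} = 0 .
```

Since this holds for every $\varepsilon > 0$, $X_N \to x$ almost surely. A rate of order $N^{-1}$ would not be enough, because the harmonic series diverges.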

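Following the same approach as in the proof of Theorem 1 in [2], we can expand $(R^\ell)_{kk}$ and $T^\ell_{[nn]}$ as follows:

$$(R^\ell)_{kk} = \sum_{s=0}^{\ell-1} h_k^H T_{\sim k}^{\ell-s-1} h_k \,(R^s)_{kk}, \qquad \ell = 1, 2, \ldots \qquad (58)$$

$$T^\ell_{[nn]} = \sum_{s=0}^{\ell-1} \delta_n R_{\sim n}^{\ell-s-1} \delta_n^H \, T^s_{[nn]}, \qquad \ell = 1, 2, \ldots \qquad (59)$$

$T^0$ and $R^0$ being the identity matrices of dimensions $rN \times rN$ and $K \times K$, respectively.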

Thanks to Property A and Property B of the convergence in probability of random sequences and the induction assumptions, the convergence with probability one of the sequences $\{(R^\ell)_{kk}\}$ and $\{T^\ell_{[nn]}\}$ reduces to the following two steps. First we show the convergence in probability of $h_k^H T^s_{\sim k} h_k$ and $\delta_n R^s_{\sim n} \delta_n^H$ to a deterministic limit, respectively. Then, we show that the convergence holds with an appropriate convergence rate which enables the application of the Borel Cantelli lemma. Let us define

$$\zeta_3 = h_k^H T^s_{\sim k} h_k - \frac{|a_k|^2}{N}\, \mathrm{tr}\, \Delta_{\phi,r}^H(\tau_k)\, T^s_{\sim k}\, \Delta_{\phi,r}(\tau_k).$$

Lemma 9 in part I [1] applied to the quadratic form $h_k^H T^s_{\sim k} h_k$ with $p = 4$ yields

KAa k 



E Ksl 4 < ^f^E (tr(A* (t^A^t,))' 



< ^<Ax0MAxtr(Tl;). (60) 

Thanks to the bound on the eigenvalue moments of the matrix $T$, $\lim_{K=\beta N\to\infty} \frac{1}{N} E(\mathrm{tr}\, T^s_{\sim k})$ is almost surely upper bounded $\forall s$ as $K = \beta N \to +\infty$. Therefore, $E|\zeta_3|^4 \to 0$ as $N \to \infty$ with $\frac{K}{N} \to \beta$, and $h_k^H T^s_{\sim k} h_k$ converges in mean square sense, and thus in probability. Furthermore, the Bienaymé inequality implies that $\Pr\{|\zeta_3| > \varepsilon\} \le o(N^{-2})$ as $N \to +\infty$. Thanks to (53)



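$$\lim_{K=\beta N\to\infty} \frac{|a_k|^2}{N}\, \mathrm{tr}\, \Delta_{\phi,r}^H(\tau_k)\, T^s_{\sim k}\, \Delta_{\phi,r}(\tau_k) = \left. \frac{\lambda}{2\pi} \int_0^{2\pi} \Delta_{\phi,r}^H(\Omega,\tau)\, T_s(\Omega)\, \Delta_{\phi,r}(\Omega,\tau)\, d\Omega \right|_{(\lambda,\tau)=(|a_k|^2,\tau_k)} + o(N^{-2}) = g(T_s,\lambda,\tau) + o(N^{-2}). \qquad (61)$$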
then

$$\Pr\{ |h_k^H T^s_{\sim k} h_k - g(T_s,\lambda,\tau)| > \varepsilon \} \le o(N^{-2}) \qquad (62)$$

thanks to Property A. Thanks to the convergence rate in (62) and the Borel Cantelli lemma, the almost sure convergence (52) follows.

The convergence with probability one of the diagonal blocks $T^\ell_{[nn]}$ can be proven in a similar way. More specifically, it can be shown that the $r \times r$ block $\delta_n R^s_{\sim n} \delta_n^H$ converges to the $r \times r$ deterministic matrix

$$f(R_s,\Omega) = \beta \int \lambda\, \Delta_{\phi,r}(\Omega,\tau)\, \Delta_{\phi,r}(\Omega,\tau)^H R_s(\lambda,\tau)\, dF_{|A|^2,T}(\lambda,\tau) \qquad (63)$$

such that $\Pr\{ |(\delta_n R^s_{\sim n} \delta_n^H)_{u,v} - (f(R_s,\Omega))_{u,v}| > \varepsilon \} \le o(N^{-2})$.

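Finally, by making use of equations (58) and (59) and the definitions (52), (54), (63), and (61) we obtain

$$R_\ell(\lambda,\tau) = \sum_{s=0}^{\ell-1} g(T_{\ell-s-1},\lambda,\tau)\, R_s(\lambda,\tau), \qquad \ell = 1, 2, \ldots \qquad (64)$$

and

$$T_\ell(\Omega) = \sum_{s=0}^{\ell-1} f(R_{\ell-s-1},\Omega)\, T_s(\Omega), \qquad \ell = 1, 2, \ldots \qquad (65)$$

with $g(T_s,\lambda,\tau)$ and $f(R_s,\Omega)$ given in (61) and (63), respectively. Consistently with the definitions of $T^0$ and $R^0$, $T_0(\Omega) = I_r$, $I_r$ being the $r \times r$ identity matrix, and $R_0(\lambda) = 1$.

Then, $g(T_0,\lambda,\tau) = \frac{\lambda}{2\pi} \int_0^{2\pi} \Delta_{\phi,r}^H(\Omega,\tau)\, \Delta_{\phi,r}(\Omega,\tau)\, d\Omega$ and $f(R_0,\Omega) = \beta \int \lambda\, \Delta_{\phi,r}(\Omega,\tau)\, \Delta_{\phi,r}^H(\Omega,\tau)\, dF_{|A|^2,T}(\lambda,\tau)$, and (64) and (65) reduce to the asymptotic limits $R_1(\lambda,\tau)$ and $T_1(\Omega)$ already derived in step 1. Therefore, we can begin the recursion with $\ell = 0$, $R_0(\lambda,\tau) = 1$ and $T_0(\Omega) = I_r$.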

Properties A, B, and C, the induction assumptions, relations (58) and (64), the convergence rates $\zeta_2 \to o(N^{-2})$ and $\Pr\{|\zeta_3| > \varepsilon\} \le o(N^{-2})$, and the Borel Cantelli lemma yield (44). The proof of (45) follows immediately along similar lines.

This concludes the proof of Theorem 1.

Appendix II
Proof of Corollary 1

Corollary 1 is derived by specializing Theorem 1 to a unitary Fourier transform $\Phi(\omega)$ with bandwidth $B \le \frac{r}{2T_c}$. Let us recall here that the unitary Fourier transform in the discrete time domain is given by

sign(n) [ij 



0(fi,r) = -e^ 



S =-sign(n) 

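The matrix $Q(\Omega,\tau) = \Delta_{\phi,r}(\Omega,\tau)\, \Delta_{\phi,r}(\Omega,\tau)^H$, with $\Delta_{\phi,r}(\Omega,\tau)$ defined in (26), can be decomposed as $Q(\Omega,\tau) = Q(\Omega) + \tilde{Q}(\Omega,\tau)$ with the elements of $Q(\Omega)$ and $\tilde{Q}(\Omega,\tau)$ defined by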



c . / * — \ \ i i ^ c / 



^2 

C S =-sign(n) L^J 



-i^(n+^*) for IQKtt, 



(67) 



and 



sign(H) [§J 

(Q(n, r)) M = -1 ^ $ ( Q+ y u \ f ^ + 2?r ^ e -^(.-«) e -i(*=i(n- a r.)-^( n -a™,) 



S ,«=- S ign(Q) 



r, 



for |Q| < 7r, (68) 

respectively. 

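Equations (24) and (25) can be rewritten as

$$f(R_s,\Omega) = \beta\, Q(\Omega) \int \lambda R_s(\lambda,\tau)\, dF_{|A|^2,T}(\lambda,\tau) + \beta \int \lambda R_s(\lambda,\tau)\, \tilde{Q}(\Omega,\tau)\, dF_{|A|^2,T}(\lambda,\tau), \qquad -\pi \le \Omega \le \pi \qquad (69)$$

and

$$g(T_s,\lambda,\tau) = \frac{\lambda}{2\pi} \int_{-\pi}^{\pi} \mathrm{tr}(T_s(\Omega) Q(\Omega))\, d\Omega + \frac{\lambda}{2\pi} \int_{-\pi}^{\pi} \mathrm{tr}(T_s(\Omega) \tilde{Q}(\Omega,\tau))\, d\Omega, \qquad (70)$$

respectively. If the conditions of Corollary 1 are satisfied, i.e. if $B \le \frac{r}{2T_c}$ and $\tau$ is uniformly distributed in $[0, T_c]$, it can be shown that

• $R_\ell(\lambda,\tau)$, $\ell \in \mathbb{Z}^+$, are independent of $\tau$ and

• $T_\ell(\Omega)$ is a matrix of the form (71)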



bo bie^ 



6 r _ie- 



6 r _ie" 



(r-l), 



. (r-2) „ 



(71) 



feie 5 



(r-l), 



fe r _ie" 



/>,-» 
$b_0 = b_0(\Omega)$, $b_1 = b_1(\Omega)$, ..., $b_{r-1} = b_{r-1}(\Omega)$ being possibly functions of $\Omega$.
These properties can be proven by strong induction. It is straightforward to verify that they are satisfied for $s = 0$. In fact, $R_0(\lambda,\tau) = 1$ is independent of $\tau$ and $T_0(\Omega) = I_r$ is of the form (71) with $b_0 = 1$ and $b_i = 0$ for $i = 1, \ldots, r-1$. By appealing to Lemma 1 in part I [1] Appendix I, $\mathrm{tr}(\tilde{Q}(\Omega,\tau)) = 0$ and $g(T_0,\lambda,\tau) = \frac{\lambda}{2\pi} \int_{-\pi}^{\pi} \mathrm{tr}(Q(\Omega))\, d\Omega$. Hence, $g(T_0,\lambda,\tau)$ is independent of $\tau$.
The induction step is proven using the following induction assumptions:
• For $s = 0, 1, \ldots, \ell-1$, $R_s(\lambda,\tau)$ is independent of $\tau$;
• For $s = 0, 1, \ldots, \ell-1$, $T_s(\Omega)$ is of the form (71).
Thanks to the form (71) of $T_s(\Omega)$, $s = 1, \ldots, \ell-1$, given by the induction assumptions, and by applying Lemma 1 in part I Appendix I, we have $\mathrm{tr}(T_s(\Omega) \tilde{Q}(\Omega,\tau)) = 0$ for $s = 0, 1, \ldots, \ell-1$. Then, (70) reduces to $g(T_s,\lambda,\tau) = \frac{\lambda}{2\pi} \int_{-\pi}^{\pi} \mathrm{tr}(T_s(\Omega) Q(\Omega))\, d\Omega$ and $g(T_s,\lambda,\tau)$ is independent of $\tau$ for $s = 0, 1, \ldots, \ell-1$. Therefore, all quantities that appear in the right hand side of (22) are independent of $\tau$ and $R_\ell(\lambda,\tau)$ is also independent of $\tau$. In the following we will write $R_\ell(\lambda)$ and $g(T_s,\lambda)$ for short instead of $R_\ell(\lambda,\tau)$ and $g(T_s,\lambda,\tau)$. Thanks to the fact that (i) $R_s(\lambda,\tau)$ is independent of $\tau$ and (ii) $\lambda$ and $\tau$ are statistically independent with $\tau$ uniformly distributed, (69) can be rewritten as

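$$f(R_s,\Omega) = \beta \int \lambda R_s(\lambda)\, dF_{|A|^2}(\lambda) \left( Q(\Omega) + \frac{1}{T_c} \int_0^{T_c} \tilde{Q}(\Omega,\tau)\, d\tau \right). \qquad (72)$$

It is straightforward to verify that $\int_0^{T_c} \tilde{Q}(\Omega,\tau)\, d\tau = 0$ from the definition of $\tilde{Q}(\Omega,\tau)$ in (68). Then,

$$f(R_s,\Omega) = \beta\, Q(\Omega) \int \lambda R_s(\lambda)\, dF_{|A|^2}(\lambda) = f(R_s)\, Q(\Omega) \qquad (73)$$

with $f(R_s) = \beta \int \lambda R_s(\lambda)\, dF_{|A|^2}(\lambda)$. Substituting (73) in (23) yields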

$$T_\ell(\Omega) = \sum_{s=0}^{\ell-1} f(R_{\ell-s-1})\, Q(\Omega)\, T_s(\Omega), \qquad -\pi \le \Omega \le \pi. \qquad (74)$$

Since $T_s(\Omega)$ is of form (71), the conditions of Lemma 2 in part I Appendix I are satisfied for $B = T_s(\Omega)$. This implies that $Q(\Omega) T_s(\Omega)$ is also of the form (71). Since $T_\ell(\Omega)$ is a linear combination of matrices of the form (71), $T_\ell(\Omega)$ is also a matrix of the form (71). Then, the statement of the strong induction is proven. Thanks to the properties shown by strong induction, the recursive equations in Theorem 1 reduce to the following set of recursive equations:

$$R_\ell(\lambda) = \sum_{s=0}^{\ell-1} g(T_{\ell-s-1},\lambda)\, R_s(\lambda) \qquad (75)$$

$$T_\ell(\Omega) = \sum_{s=0}^{\ell-1} f(R_{\ell-s-1})\, Q(\Omega)\, T_s(\Omega), \qquad -\pi \le \Omega \le \pi \qquad (76)$$

$$f(R_s) = \beta \int \lambda R_s(\lambda)\, dF_{|A|^2}(\lambda), \qquad (77)$$

$$g(T_s,\lambda) = \frac{\lambda}{2\pi} \int_{-\pi}^{\pi} \mathrm{tr}(T_s(\Omega) Q(\Omega))\, d\Omega \qquad (78)$$

with $T_0(\Omega) = I_r$ and $R_0(\lambda) = 1$.

Then, applying again Theorem 1 we obtain the following convergence with probability one

$$\lim_{K=\beta N\to\infty} (R^\ell)_{kk} = R_\ell(\lambda)\big|_{\lambda=|a_k|^2}.$$

From (76) and $T_0(\Omega) = I_r$ it is apparent that $T_\ell(\Omega)$ is a polynomial in $Q(\Omega)$ with powers $Q^s(\Omega)$, $s = 0, 1, \ldots, \ell$. Then, $T_\ell(\Omega)$ has the same eigenvectors as $Q(\Omega)$ and it can be written as $T_\ell(\Omega) = U(\Omega) \Lambda_\ell(\Omega) U^H(\Omega)$ where $\Lambda_\ell(\Omega)$ is a diagonal matrix with diagonal elements $t_{\ell,1}, t_{\ell,2}, \ldots, t_{\ell,r}$ and

r — 1 



U(Q) = [e[n- sign(fi)27r 



with e (fi) r-dimensional column vector defined by 



. . . e (ft) . . . e m + sign(ft)2?r 



(79) 



e r 



By making use of the eigenvalue decomposition of the matrix $Q(\Omega)$ in part I Appendix I Lemma 3, the matrix equation (76) reduces to $r$ scalar equations

£-1 



I — -sign(ft) — 



r - 1 



u + 1 



is,«(fi) u=l,...r and |0| < 71". 



By substituting y = Q — sign(f2)27r ([^J —u+l) for \Q\ < n we obtain 



tt u [y + 2tt 



r — 1 



» +1 )) = |^%l*fe 



t s „ 7/ + 27T 



r — 1 



u + l 



(80) 



for < y + 2tt ( - u + l) < tt and 

^ + 0)-£ /w ~#( 



t SjU [y - 27r 



r — 1 



u+l 



(81) 
for — 7r < y — 2ir ( — u + l) < 0. Then, for u = 1, . . . r, the r functions (|80l) and ( f8~TT) defined in
non-overlapping intervals in $[-2\pi r, 2\pi r]$ can be combined in a unique scalar function $T_\ell(y)$ in the interval $|y| \le 2\pi r$ satisfying the recursive equation

2 



s=0 



-8-1 j 



Similar arguments applied to (78) yield



2^ 



7^2 

r7r c 



^(y) 



r, 



dy. 



The substitutions $\omega = \frac{y}{T_c}$ and $T_\ell(\omega T_c) = T_\ell(\omega)$ yield the recursive equations in Corollary 1.
This concludes the derivation of Corollary 1 from Theorem 1.



Appendix III
Derivation of Algorithm 1

Algorithm 1 can be derived from the recursive equations of Corollary 1 by using the following substitutions:



A 

R s (\) 

E(XR S (X)) = ±f(R s ) 

Y l $ HI 2 



r 



\<$>{uj)\ 2 T s {u) 



2ttB 



|$(o;)| 2 T s (a;)da; 



Ps{z) 

v s {z) 

V s 

y 

u s (y) 



2ixT c J_ 2nB 

Then, the initial step is obtained by defining $\mu_0(y) = 1$ and $\rho_0(z) = 1$. The recursive equations in step $\ell$
are obtained by using the previous substitutions. In order to derive U s let us observe that ^- |<£> {uj)\ 2 T s (uj) 
is a polynomial in y = ^- 1$ (u) \ 2 of degree s + 1. Then, ?7 S is a linear combination of f?- where 



£ 



"'n-l 



2tt_B 



|$MI 2n du; 



27TT" - 

5 Note that the substitution of $\lambda$ with $z$ is redundant. It is used to obtain polynomials in the commonly used variable $z$.
The coefficients of the linear combination are obtained by expanding $u_s(y)$ as a polynomial in $y$.

We conclude the derivation of Algorithm 1 by summarizing the previous considerations and substitutions:



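$$\rho_\ell(z) = \sum_{s=0}^{\ell-1} z\, U_{\ell-s-1}\, \rho_s(z)$$

$$\mu_\ell(y) = \sum_{s=0}^{\ell-1} \beta\, y\, V_{\ell-s-1}\, \mu_s(y)$$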

• $U_s$ and $V_s$ are obtained from $u_s(y) = y \mu_s(y)$ and $v_s(z) = z \rho_s(z)$, respectively, by

- expanding $u_s(y)$ and $v_s(z)$ as polynomials in $y$ and $z$, respectively,

- replacing the monomials $y^n$ and $z^n$, $n \in \mathbb{Z}^+$, with $\mathcal{E}_n$ and $m^{(n)}_{|A|^2}$, respectively.

Then, $R_\ell(\lambda) = \rho_\ell(\lambda)$ and the eigenvalue moment $m_R^{(\ell)} = E\{R_\ell(\lambda)\}$ is obtained by replacing all monomials $z, z^2, \ldots, z^\ell$ in the polynomial $\rho_\ell(z)$ by the moments $m^{(1)}_{|A|^2}, m^{(2)}_{|A|^2}, \ldots, m^{(\ell)}_{|A|^2}$, respectively.
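The polynomial recursion summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's reference implementation: polynomials are stored as coefficient lists in increasing degree, the two moment sequences are passed in as hypothetical callables `pulse_moment(n)` and `power_moment(n)` (standing for the quantities that replace $y^n$ and $z^n$), and the update for the second polynomial is assumed to be $\mu_\ell(y) = \sum_s \beta\, y\, V_{\ell-s-1}\, \mu_s(y)$, as obtained from (76) and (77) with the substitutions of this appendix.

```python
def shift(poly):
    """Multiply a polynomial (coefficients in increasing degree) by its variable."""
    return [0.0] + list(poly)


def add_scaled(acc, poly, scale):
    """Return acc + scale * poly, padding the shorter list with zeros."""
    out = [0.0] * max(len(acc), len(poly))
    for i, c in enumerate(acc):
        out[i] += c
    for i, c in enumerate(poly):
        out[i] += scale * c
    return out


def replace_monomials(poly, moment):
    """Replace x^n by moment(n) for n >= 1; the constant term is kept as is."""
    return poly[0] + sum(c * moment(n) for n, c in enumerate(poly) if n >= 1)


def eigenvalue_moments(order, beta, pulse_moment, power_moment):
    """Return the first `order` eigenvalue moments from the polynomial recursion."""
    rho = [[1.0]]  # rho_0(z) = 1
    mu = [[1.0]]   # mu_0(y) = 1
    U, V = [], []
    for ell in range(1, order + 1):
        # U_{ell-1} and V_{ell-1} from u(y) = y * mu(y) and v(z) = z * rho(z).
        U.append(replace_monomials(shift(mu[ell - 1]), pulse_moment))
        V.append(replace_monomials(shift(rho[ell - 1]), power_moment))
        rho_ell, mu_ell = [0.0], [0.0]
        for s in range(ell):
            rho_ell = add_scaled(rho_ell, shift(rho[s]), U[ell - s - 1])
            mu_ell = add_scaled(mu_ell, shift(mu[s]), beta * V[ell - s - 1])
        rho.append(rho_ell)
        mu.append(mu_ell)
    return [replace_monomials(rho[ell], power_moment) for ell in range(1, order + 1)]


# Sanity check with all moments equal to one (equal powers, flat unit pulse spectrum).
moments = eigenvalue_moments(3, 0.5, lambda n: 1.0, lambda n: 1.0)
print(moments)
```

With all input moments set to one, the sketch returns $1$, $1+\beta$ and $1+3\beta+\beta^2$, which are the first three moments of the Marchenko-Pastur law for load $\beta$; this is a useful consistency check before plugging in the moments of a specific chip pulse and power distribution.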



Appendix IV 
Proof of Theorem [2] 

The proof of Theorem 2 follows along the lines of the proof of Theorem 1. As in the proof of Theorem 1, we can focus on the spreading matrix $S$ in (39) and the autocorrelation $R$.
For a signal with bandwidth B < 



and t) = — 2ir [^J , r) for any f2. Correspondingly, we define 

1 { Q\ i-rn 
A^(fi,r) = -$(-)e-^e(fi), 

with e(f2) = (l,e J ~ ; . . .e j ~^~ Q ) and 















IT 





for any fl 

We adopt here the same notation as in the proof of Theorem 1. Then, the $K \times K$ diagonal matrix $V_{n,\ell}$, for $\ell = 1, \ldots, r$ and $n = 1, \ldots, N$, is given by

1 / j2,TV \ j27rn(t-l) / jgfsfl j27rn.T 2 j2irnj K 

V nt = — $* e »■ diag e T = , e T = , . . . e T = 
with n = ^jf-— and A^ r (T fc ) is the rN x N block diagonal matrix with n diagonal block A^ r (n, r k ).

We develop the proof by strong induction as in Theorem 1, with a similar initial step and a similar induction step.
Step 1: In this case



$$R_{kk} = |a_k|^2 s_k^H \Delta_{\phi,r}^H(\tau_k)\, \Delta_{\phi,r}(\tau_k)\, s_k = |a_k|^2 s_k^H \mathbf{\Phi} s_k$$

where $ is a matrix independent of r k and the n th element is given by 3> n „ = £ 
By following the same approach as in Theorem 1, it results that, $\forall \varepsilon > 0$,

|2 



Pr 



R 



kk 



r\a k \ 



E 



n=0 





2 











N 2 e 4 



being A M ax = max ne[ _ 7ri7r ] $ ^ 
^(A)| 



and 



lim 



N-l 

£ 



£=0 



2^ 



2vr / n 

Y c (n 



2n 

TV 



(82) 



Furthermore, as in Theorem 1 it can be shown that $\Pr\{ |R_{kk} - R_1(|a_k|^2)| > \varepsilon \} \le o(N^{-2})$, with the consequent convergence with probability one by the Borel Cantelli lemma



lim R kk =' Ri(\a k \ 2 ] 

K=f3N^+oo 



A=|a fc | 2 " 



Similarly, $(T_{[nn]})_{uv}$, the $(u,v)$-element of the matrix $T_{[nn]}$, is given by

n 

Inn 



T\nn] — "'n^ V niU V n „ A (7 n 



1 

T. 



(83) 



As in Theorem 1 it can be shown that

~ , 1 



Pr 



27m 

~t7 



K/T? 



4^MAX 

N 2 e 4 



with T M ax = ( maxnet-fl-.w] 
holds 



lim (T [m A 



(sup^ maxfc |afc| 2 ) and the following convergence in probability 





K=l3N^oo 



lim 

K=f3N^oo T C K 



K 



2 E 



a k \ 



k=i 
with $\Omega = 2\pi \lim_{N\to\infty} \frac{n}{N}$ and $|\Omega| \le \pi$. Thus, the diagonal block converges in probability as follows



Ti(n)= lim (T M ) W 

K =pN — >oo 



v l_ 



XdF lAl 2(X)e(n)e H (n) 



(84) 



Furthermore,

$$\Pr\{ |(T_{[nn]})_{uv} - (T_1(\Omega))_{uv}| > \varepsilon \} \le o(N^{-2}).$$

Then, the convergence in probability (84) holds also with probability one by the Borel Cantelli lemma. This concludes the first step of the induction.
Step $\ell$: Let us observe that



0! = -tr AV n , u R^ u A H 



u — v 



N 



|Qfc| 
fc=i c 



E 



and 



|Qfc| 
AT 



-trAj r (r fc )Tl fc A» r (r fc ) 



? 1 



|Qfc| 

n=l 



27m 



e w (27rn)(Tl A; ) n „e(2 7 rn). 



By following the same approach as in Theorem 1 it can be shown that $\theta_1$ and $\theta_2$ converge almost surely to the following limits



lim 0i = -^-e 

K=l3N^oo T c 2 



-j2irn- 



Ai? s (A)dF, A | 2 (A) 



and 



lim 02 = — 

K=0N^oo 27rT c 2 ,, 



e H (Q)T s {Q)e{Q)dQ 



A=|a fc p 



with $R_s(\lambda)|_{\lambda=|a_k|^2} = \lim_{K=\beta N\to\infty} (R^s)_{kk}$ and $T_s(\Omega) = \lim_{K=\beta N\to\infty} T^s_{[nn]}$ given by the recursion assumptions.

Additionally, it can be shown that the following almost sure convergence holds

h - 



#(T s ,A)| A=|ofe|2 =_lim T^/i fe 



K=0N^oc 

A 



2ttT, 



e^(n)T s (fi)e(fi)dfi 



(85) 



A=|o fc |= 
and



f(Rs, n)= lim 6 n R^J n 

K=pN — >oo 



J^2 



e(Q)e H (Q) I Ai? s (A)dF| A | 2 (A) 



Furthermore, the convergence satisfies the bounds 



$$\Pr\{ |h_k^H T^s_{\sim k} h_k - g(T_s, |a_k|^2)| > \varepsilon \} \le o(N^{-2})$$



and 



$$\Pr\{ |(\delta_n R^s_{\sim n} \delta_n^H)_{u,v} - (f(R_s,\Omega))_{u,v}| > \varepsilon \} \le o(N^{-2})$$

for large $N$ and $\forall \varepsilon > 0$.

The recursion assumptions and the limits (85) and (86) in (58) and (59) yield



e-i 



s=0 
£-1 



s=0 



2vrT c 2 „ 



tr (T s (n)e(n)e H (n)) dn 



A=|a fe |2 



and 



s=0 



e-i 



7^2 

c 



A J R s (A)dF| A | 2 (A) e{Q)e H (Q)T s (Q) 



(86) 



(87) 



(88) 



where $R_0(\lambda) = 1$ and $T_0(\Omega) = I_r$. With a similar approach as in Theorem 1 it can be proven that for large $N$ and $\forall \varepsilon > 0$

$$\Pr\{ |(R^\ell)_{kk} - R_\ell(|a_k|^2)| > \varepsilon \} \le o(N^{-2})$$



and 



$$\Pr\{ |(T^\ell_{[nn]})_{uv} - (T_\ell(\Omega))_{uv}| > \varepsilon \} \le o(N^{-2}).$$

In contrast to Theorem 1, the recursive equations (87), (88), (85), and (86) are independent of the time delay $\tau_k$.

The recursive equations can be further simplified by observing that $(e(\Omega) e^H(\Omega))^m = r^{m-1} e(\Omega) e^H(\Omega)$. Then, it is straightforward to verify by recursion that the matrix $\mathbf{T}_s(\Omega)$, $s = 1, 2, \ldots, \ell-1$, is proportional to the matrix $e(\Omega) e^H(\Omega)$, and we can express it as $\mathbf{T}_s(\Omega) = T_s(\Omega)\, e(\Omega) e^H(\Omega)$, $s = 1, 2, \ldots$, with $T_s(\Omega)$ a scalar function. Thus, the recursive equations can be rewritten as



i-i 



Rt(X) = ^< 7 (T i _ a _ 1 ,A) J R a (A) 

s=0 
l-\ 

T e (Q)e(Q)e H (Q) = J2f(Ri^i,n)T s (Q)e(Q)e H (Q) + f{R t . 1 ,n)T (n) 



1,2, 



f(R a ,n) = f(R a , n)e(Q)e H (Q) 

2 r 

XR s (X)dF ]A] ,(X) 



(89) 
(90) 







T, 



-7T < Q < 7T 



^(T S ,A) 



-*k r $(- 



s = 0. 



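with $\mathbf{T}_0(\Omega) = I_r$ and $R_0(\lambda) = 1$.
Substituting (90) in (89) we obtain

$$T_\ell(\Omega)\, e(\Omega) e^H(\Omega) = \sum_{s=1}^{\ell-1} f(R_{\ell-s-1},\Omega)\, T_s(\Omega)\, e(\Omega) e^H(\Omega)\, e(\Omega) e^H(\Omega) + f(R_{\ell-1},\Omega)\, \mathbf{T}_0(\Omega)\, e(\Omega) e^H(\Omega)$$

$$= r \sum_{s=1}^{\ell-1} f(R_{\ell-s-1},\Omega)\, T_s(\Omega)\, e(\Omega) e^H(\Omega) + f(R_{\ell-1},\Omega)\, \mathbf{T}_0(\Omega)\, e(\Omega) e^H(\Omega). \qquad (91)$$

Recalling that $\mathbf{T}_0(\Omega) = I_r$ and defining the scalar $T_0(\Omega) = \frac{1}{r}$, we obtain from (91) the scalar $T_\ell(\Omega)$:

$$T_\ell(\Omega) = r \sum_{s=0}^{\ell-1} f(R_{\ell-s-1},\Omega)\, T_s(\Omega). \qquad (92)$$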
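The reduction from matrix to scalar recursions rests on the rank-one identity $(e e^H)^m = r^{m-1} e e^H$, which holds because every entry of $e(\Omega)$ has unit modulus, so that $e^H e = r$. The following small Python check illustrates it numerically; the explicit form of $e(\Omega)$ used here (entries $e^{j i \Omega / r}$, $i = 0, \ldots, r-1$) and the helper names are assumptions for illustration only.

```python
import cmath


def steering_vector(omega, r):
    """Assumed form of e(Omega): r unit-modulus entries exp(j * i * omega / r)."""
    return [cmath.exp(1j * omega * i / r) for i in range(r)]


def outer(e):
    """Rank-one matrix e e^H as a list of rows."""
    return [[a * b.conjugate() for b in e] for a in e]


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


r = 4
e = steering_vector(0.7, r)
P = outer(e)
P_cubed = matmul(matmul(P, P), P)

# (e e^H)^3 should equal r^2 * e e^H up to rounding errors.
max_err = max(abs(P_cubed[i][j] - r ** 2 * P[i][j])
              for i in range(r) for j in range(r))
print(max_err)
```

The same identity with $m = 2$ is what turns the product $e e^H e e^H$ into the factor $r$ multiplying the sum over $s$.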



The following equations summarize the recursion in terms of only scalar functions. 



R t (X) = ^g(T t - 8 - 1 ,X)R s {X) 



s=0 



T e (n) = rJ2f(R e ^ l ,n)T s (n) 



f(R s ,n) 
g(T s ,x) 



c 

r 2 A 



2vrT c 2 „ 



A J R s (A)d^| A | 2 (A) 



T s (fi)dfi 



Ixl < 7T 



s = 0,l, 



with T (f2) = -y and i?o(A) = 1. Let us observe that the different expressions of g(T s , X) for s — 0, 1, . . . 
could be absorbed in a unified expression by initialize the recursion with T (f2) = instead of using 

Tom = i 
The recursion in the statement of Theorem 2 is obtained by defining

$$f(R_s) = \int \lambda R_s(\lambda)\, dF_{|A|^2}(\lambda)$$

and 

and by expressing Re(\) and Ti(uj) as recursive functions of f(R s ) and ^(T s ). 